[jira] [Updated] (MAPREDUCE-5260) Job failed because of JvmManager running into inconsistent state
[ https://issues.apache.org/jira/browse/MAPREDUCE-5260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhaoyunjiong updated MAPREDUCE-5260:

Attachment: MAPREDUCE-5260.patch

The root cause of JvmManager running into an inconsistent state is that the TaskTracker lacks the user's information:

2013-05-14 07:01:31,482 INFO org.apache.hadoop.mapred.TaskTracker: About to purge task: attempt_201305100625_20199_m_000431_0
2013-05-14 07:01:31,485 INFO org.apache.hadoop.mapred.TaskController: Reading task controller config from /etc/hadoop/taskcontroller.cfg
2013-05-14 07:01:31,485 INFO org.apache.hadoop.mapred.TaskController: User zhaoyunjiong not found
2013-05-14 07:01:31,485 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.io.IOException: Problem signalling task 30048 with TERM; exit = 255
    at org.apache.hadoop.mapred.LinuxTaskController.signalTask(LinuxTaskController.java:319)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType$JvmRunner.kill(JvmManager.java:555)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.killJvmRunner(JvmManager.java:317)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.killJvm(JvmManager.java:297)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.taskKilled(JvmManager.java:289)
    at org.apache.hadoop.mapred.JvmManager.taskKilled(JvmManager.java:158)
    at org.apache.hadoop.mapred.TaskRunner.kill(TaskRunner.java:801)
    at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.kill(TaskTracker.java:3279)
    at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.jobHasFinished(TaskTracker.java:3251)
    at org.apache.hadoop.mapred.TaskTracker.purgeTask(TaskTracker.java:2286)
    at org.apache.hadoop.mapred.TaskTracker.markUnresponsiveTasks(TaskTracker.java:2185)
    at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1862)
    at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2646)
    at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3900)

This patch catches the IOException thrown by LinuxTaskController to prevent the inconsistent state. It also makes sure the TaskTracker (TT) shuts itself down if it does run into an inconsistent state.

Job failed because of JvmManager running into inconsistent state

Key: MAPREDUCE-5260
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5260
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: tasktracker
Affects Versions: 1.1.2
Reporter: zhaoyunjiong
Fix For: 1.1.3
Attachments: MAPREDUCE-5260.patch

In our cluster, jobs failed because task initialization randomly failed: JvmManager ran into an inconsistent state and the TaskTracker failed to exit:

java.lang.Throwable: Child Error
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.getDetails(JvmManager.java:402)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:387)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:192)
    at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:125)
    at org.apache.hadoop.mapred.TaskRunner.launchJvmAndWait(TaskRunner.java:292)
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:251)
---
java.lang.Throwable: Child Error
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.getDetails(JvmManager.java:402)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:387)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:192)
    at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:125)
    at org.apache.hadoop.mapred.TaskRunner.launchJvmAndWait(TaskRunner.java:292)
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:251)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
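Editor's sketch: the comment above says the patch catches the IOException raised while signalling a task so that bookkeeping does not end up half-updated. The attached patch itself is not reproduced in this thread, so the following is only a minimal illustration of that general pattern, under the assumption that the inconsistency comes from an exception escaping between two related updates. `JvmRegistrySketch`, `Signaller`, and both maps are hypothetical names, not the real JvmManager code.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Simplified bookkeeping sketch; NOT the real JvmManager.
final class JvmRegistrySketch {
    // Hypothetical stand-in for a task controller whose signal call can fail.
    interface Signaller {
        void signal(int pid) throws IOException;
    }

    // Two maps that must always describe the same set of JVMs.
    private final Map<String, Integer> jvmToPid = new HashMap<>();
    private final Map<String, String> jvmToTask = new HashMap<>();

    void register(String jvmId, int pid, String taskId) {
        jvmToPid.put(jvmId, pid);
        jvmToTask.put(jvmId, taskId);
    }

    // Returns true if the signal was delivered. The bookkeeping is cleaned
    // up in both cases, so a failed signal cannot leave the maps out of step.
    boolean kill(String jvmId, Signaller signaller) {
        Integer pid = jvmToPid.get(jvmId);
        if (pid == null) {
            return false; // unknown JVM, nothing to do
        }
        boolean signalled = true;
        try {
            signaller.signal(pid);
        } catch (IOException e) {
            signalled = false;
            System.err.println("Could not signal " + pid + ": " + e.getMessage());
        } finally {
            jvmToPid.remove(jvmId);
            jvmToTask.remove(jvmId);
        }
        return signalled;
    }

    boolean isConsistent() {
        return jvmToPid.keySet().equals(jvmToTask.keySet());
    }

    int size() {
        return jvmToPid.size();
    }

    public static void main(String[] args) {
        JvmRegistrySketch registry = new JvmRegistrySketch();
        registry.register("jvm_1", 30048, "attempt_1");
        boolean ok = registry.kill("jvm_1", pid -> {
            throw new IOException("Problem signalling task " + pid + " with TERM; exit = 255");
        });
        System.out.println("signalled=" + ok + " consistent=" + registry.isConsistent());
    }
}
```

Whether the real fix removes the entries, retries the signal, or shuts the TaskTracker down at this point is not stated here; the sketch only shows that handling the exception locally keeps related state in step.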
[jira] [Commented] (MAPREDUCE-5199) AppTokens file can/should be removed
[ https://issues.apache.org/jira/browse/MAPREDUCE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661816#comment-13661816 ]

Vinod Kumar Vavilapalli commented on MAPREDUCE-5199:

bq. The issue stems from conf.getCredentials().addAll(credentials). Conf is a JobConf, and credentials is obtained via the login UGI. These credentials include the app token so by propagating them into the jobConf, the tasks acquire the app token.

I guess you are referring to that call in MRAppMaster. That conf is *never* propagated to the tasks, as I said before. The conf that tasks see is the one written out by the client.

I still don't understand the problem. Please share logs, stack traces, or a test case that fails.

I quickly wrote up a patch for YARN-701 and modified TestMRJobs and SleepJob to print out all credentials. As I suspected, the ApplicationToken never goes through to the task, either via the UGI or the conf.

AppTokens file can/should be removed

Key: MAPREDUCE-5199
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5199
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: security
Affects Versions: 3.0.0, 2.0.5-beta
Reporter: Vinod Kumar Vavilapalli
Assignee: Daryn Sharp
Priority: Blocker
Attachments: MAPREDUCE-5199.patch

All the required tokens are propagated to AMs and containers via startContainer(), so there is no need to explicitly create the app-token file that we have today.
[jira] [Updated] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Tryuber updated MAPREDUCE-3859:

Release Note: Fixed wrong CapacityScheduler resource allocation for high memory consumption jobs
Status: Patch Available (was: Open)

The fix is for MR1 only. The test and the fix are both in the patch.

CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs

Key: MAPREDUCE-3859
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3859
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: capacity-sched
Affects Versions: 1.0.0
Environment: CDH3u1
Reporter: Sergey Tryuber
Assignee: Sergey Tryuber
Attachments: test-to-fail.patch.txt

Imagine we have a queue A with a capacity of 10 slots and 20 as extra-capacity: jobs that use 3 map slots will never consume more than 9 slots, regardless of how many free slots there are on the cluster.
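Editor's sketch: the 9-slot figure in the description follows from simple arithmetic if one assumes that each task of the job takes 3 slots and that a task is admitted only while the total stays within some limit. The code below illustrates that arithmetic only; it is not the CapacityScheduler code, `maxUsableSlots` is a hypothetical helper, and reading "20 as extra-capacity" as a ceiling of 10 + 20 = 30 is an assumption.

```java
// Arithmetic illustration only; NOT the real CapacityScheduler logic.
final class SlotLimitSketch {
    // Largest number of slots a job can occupy when each task needs
    // slotsPerTask slots and a task is admitted only if the total
    // stays at or below limit.
    static int maxUsableSlots(int limit, int slotsPerTask) {
        if (slotsPerTask <= 0) {
            throw new IllegalArgumentException("slotsPerTask must be positive");
        }
        int used = 0;
        while (used + slotsPerTask <= limit) {
            used += slotsPerTask;
        }
        return used;
    }

    public static void main(String[] args) {
        // Checked against the guaranteed capacity only: 3 tasks fit.
        System.out.println(maxUsableSlots(10, 3));
        // Checked against capacity plus extra-capacity (assumed 30): 10 tasks fit.
        System.out.println(maxUsableSlots(30, 3));
    }
}
```

If the admission check compares against the guaranteed capacity rather than the larger ceiling, the job stops at 9 slots, which matches the behavior reported in the issue.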
[jira] [Updated] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Tryuber updated MAPREDUCE-3859:

Attachment: MAPREDUCE-3859_MR1_fix_and_test.patch.txt

Test case and fix.

CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs

Key: MAPREDUCE-3859
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3859
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: capacity-sched
Affects Versions: 1.0.0
Environment: CDH3u1
Reporter: Sergey Tryuber
Assignee: Sergey Tryuber
Attachments: MAPREDUCE-3859_MR1_fix_and_test.patch.txt, test-to-fail.patch.txt

Imagine we have a queue A with a capacity of 10 slots and 20 as extra-capacity: jobs that use 3 map slots will never consume more than 9 slots, regardless of how many free slots there are on the cluster.
[jira] [Commented] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661831#comment-13661831 ]

Hadoop QA commented on MAPREDUCE-3859:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12583806/MAPREDUCE-3859_MR1_fix_and_test.patch.txt
against trunk revision .

{color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3653//console

This message is automatically generated.

CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs

Key: MAPREDUCE-3859
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3859
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: capacity-sched
Affects Versions: 1.0.0
Environment: CDH3u1
Reporter: Sergey Tryuber
Assignee: Sergey Tryuber
Attachments: MAPREDUCE-3859_MR1_fix_and_test.patch.txt, test-to-fail.patch.txt

Imagine we have a queue A with a capacity of 10 slots and 20 as extra-capacity: jobs that use 3 map slots will never consume more than 9 slots, regardless of how many free slots there are on the cluster.
[jira] [Commented] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661838#comment-13661838 ]

Sergey Tryuber commented on MAPREDUCE-3859:

Arun, I've attached a patch for branch-1 with a test case and the fix (thanks for pointing me to the right branch).

"happy to help with YARN/trunk if you want." Yes, please. You know, I had trouble understanding the test cases of the YARN version of CS. I'm not sure the testing architecture is right: there is one huge capacity scheduler configuration with lots of queues. This scheduler configuration is created at the beginning of each test by a Before method, and each test uses that configuration. I think this is not a good choice, because it doesn't allow testing edge cases and is hard to understand (there are no comments at all). So please, could you help me and take care of the fix for YARN?

P.S. Hardcored mocks are great, but, personally, I'd prefer the old-school approach with inversion of control (strategy pattern) and an agile architecture.

CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs

Key: MAPREDUCE-3859
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3859
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: capacity-sched
Affects Versions: 1.0.0
Environment: CDH3u1
Reporter: Sergey Tryuber
Assignee: Sergey Tryuber
Attachments: MAPREDUCE-3859_MR1_fix_and_test.patch.txt, test-to-fail.patch.txt

Imagine we have a queue A with a capacity of 10 slots and 20 as extra-capacity: jobs that use 3 map slots will never consume more than 9 slots, regardless of how many free slots there are on the cluster.
[jira] [Updated] (MAPREDUCE-5260) Job failed because of JvmManager running into inconsistent state
[ https://issues.apache.org/jira/browse/MAPREDUCE-5260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhaoyunjiong updated MAPREDUCE-5260:

Attachment: MAPREDUCE-5260-branch-1.1.patch

Updated the patch name for branch 1.1.

Job failed because of JvmManager running into inconsistent state

Key: MAPREDUCE-5260
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5260
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: tasktracker
Affects Versions: 1.1.2
Reporter: zhaoyunjiong
Fix For: 1.1.3
Attachments: MAPREDUCE-5260-branch-1.1.patch

In our cluster, jobs failed because task initialization randomly failed: JvmManager ran into an inconsistent state and the TaskTracker failed to exit:

java.lang.Throwable: Child Error
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.getDetails(JvmManager.java:402)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:387)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:192)
    at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:125)
    at org.apache.hadoop.mapred.TaskRunner.launchJvmAndWait(TaskRunner.java:292)
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:251)
---
java.lang.Throwable: Child Error
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.getDetails(JvmManager.java:402)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:387)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:192)
    at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:125)
    at org.apache.hadoop.mapred.TaskRunner.launchJvmAndWait(TaskRunner.java:292)
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:251)
[jira] [Updated] (MAPREDUCE-5260) Job failed because of JvmManager running into inconsistent state
[ https://issues.apache.org/jira/browse/MAPREDUCE-5260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhaoyunjiong updated MAPREDUCE-5260:

Attachment: (was: MAPREDUCE-5260.patch)

Job failed because of JvmManager running into inconsistent state

Key: MAPREDUCE-5260
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5260
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: tasktracker
Affects Versions: 1.1.2
Reporter: zhaoyunjiong
Fix For: 1.1.3
Attachments: MAPREDUCE-5260-branch-1.1.patch

In our cluster, jobs failed because task initialization randomly failed: JvmManager ran into an inconsistent state and the TaskTracker failed to exit:

java.lang.Throwable: Child Error
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.getDetails(JvmManager.java:402)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:387)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:192)
    at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:125)
    at org.apache.hadoop.mapred.TaskRunner.launchJvmAndWait(TaskRunner.java:292)
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:251)
---
java.lang.Throwable: Child Error
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.getDetails(JvmManager.java:402)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:387)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:192)
    at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:125)
    at org.apache.hadoop.mapred.TaskRunner.launchJvmAndWait(TaskRunner.java:292)
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:251)
[jira] [Commented] (MAPREDUCE-5257) TestContainerLauncherImpl fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661893#comment-13661893 ]

Hudson commented on MAPREDUCE-5257:

Integrated in Hadoop-Yarn-trunk #215 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/215/])
MAPREDUCE-5257. Fix issues in TestContainerLauncherImpl after YARN-617. Contributed by Omkar Vinit Joshi. (Revision 1484349)

Result = FAILURE
vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1484349
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/launcher/TestContainerLauncherImpl.java

TestContainerLauncherImpl fails

Key: MAPREDUCE-5257
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5257
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mr-am, mrv2
Affects Versions: 2.0.5-beta
Reporter: Jason Lowe
Assignee: Omkar Vinit Joshi
Fix For: 2.0.5-beta
Attachments: MAPREDUCE-5257-20130516.patch

TestContainerLauncherImpl is hanging and eventually being killed by the surefire timeout, which fails a maven test build.
[jira] [Commented] (MAPREDUCE-5257) TestContainerLauncherImpl fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661965#comment-13661965 ]

Hudson commented on MAPREDUCE-5257:

Integrated in Hadoop-Hdfs-trunk #1404 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1404/])
MAPREDUCE-5257. Fix issues in TestContainerLauncherImpl after YARN-617. Contributed by Omkar Vinit Joshi. (Revision 1484349)

Result = FAILURE
vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1484349
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/launcher/TestContainerLauncherImpl.java

TestContainerLauncherImpl fails

Key: MAPREDUCE-5257
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5257
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mr-am, mrv2
Affects Versions: 2.0.5-beta
Reporter: Jason Lowe
Assignee: Omkar Vinit Joshi
Fix For: 2.0.5-beta
Attachments: MAPREDUCE-5257-20130516.patch

TestContainerLauncherImpl is hanging and eventually being killed by the surefire timeout, which fails a maven test build.
[jira] [Commented] (MAPREDUCE-5257) TestContainerLauncherImpl fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661983#comment-13661983 ]

Hudson commented on MAPREDUCE-5257:

Integrated in Hadoop-Mapreduce-trunk #1431 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1431/])
MAPREDUCE-5257. Fix issues in TestContainerLauncherImpl after YARN-617. Contributed by Omkar Vinit Joshi. (Revision 1484349)

Result = FAILURE
vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1484349
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/launcher/TestContainerLauncherImpl.java

TestContainerLauncherImpl fails

Key: MAPREDUCE-5257
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5257
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mr-am, mrv2
Affects Versions: 2.0.5-beta
Reporter: Jason Lowe
Assignee: Omkar Vinit Joshi
Fix For: 2.0.5-beta
Attachments: MAPREDUCE-5257-20130516.patch

TestContainerLauncherImpl is hanging and eventually being killed by the surefire timeout, which fails a maven test build.
[jira] [Commented] (MAPREDUCE-5261) TestRMContainerAllocator is exiting and failing the build
[ https://issues.apache.org/jira/browse/MAPREDUCE-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662049#comment-13662049 ]

Jason Lowe commented on MAPREDUCE-5261:

This broke when YARN-617 was integrated.

TestRMContainerAllocator is exiting and failing the build

Key: MAPREDUCE-5261
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5261
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 2.0.5-beta
Reporter: Jason Lowe

Recent builds are failing because TestRMContainerAllocator is exiting rather than succeeding or failing.
[jira] [Assigned] (MAPREDUCE-5260) Job failed because of JvmManager running into inconsistent state
[ https://issues.apache.org/jira/browse/MAPREDUCE-5260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benoy Antony reassigned MAPREDUCE-5260:

Assignee: zhaoyunjiong

Job failed because of JvmManager running into inconsistent state

Key: MAPREDUCE-5260
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5260
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: tasktracker
Affects Versions: 1.1.2
Reporter: zhaoyunjiong
Assignee: zhaoyunjiong
Fix For: 1.1.3
Attachments: MAPREDUCE-5260-branch-1.1.patch

In our cluster, jobs failed because task initialization randomly failed: JvmManager ran into an inconsistent state and the TaskTracker failed to exit:

java.lang.Throwable: Child Error
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.getDetails(JvmManager.java:402)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:387)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:192)
    at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:125)
    at org.apache.hadoop.mapred.TaskRunner.launchJvmAndWait(TaskRunner.java:292)
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:251)
---
java.lang.Throwable: Child Error
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.getDetails(JvmManager.java:402)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:387)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:192)
    at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:125)
    at org.apache.hadoop.mapred.TaskRunner.launchJvmAndWait(TaskRunner.java:292)
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:251)
[jira] [Commented] (MAPREDUCE-5260) Job failed because of JvmManager running into inconsistent state
[ https://issues.apache.org/jira/browse/MAPREDUCE-5260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662080#comment-13662080 ]

Benoy Antony commented on MAPREDUCE-5260:

Reviewed. +1

Job failed because of JvmManager running into inconsistent state

Key: MAPREDUCE-5260
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5260
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: tasktracker
Affects Versions: 1.1.2
Reporter: zhaoyunjiong
Assignee: zhaoyunjiong
Fix For: 1.1.3
Attachments: MAPREDUCE-5260-branch-1.1.patch

In our cluster, jobs failed because task initialization randomly failed: JvmManager ran into an inconsistent state and the TaskTracker failed to exit:

java.lang.Throwable: Child Error
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.getDetails(JvmManager.java:402)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:387)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:192)
    at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:125)
    at org.apache.hadoop.mapred.TaskRunner.launchJvmAndWait(TaskRunner.java:292)
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:251)
---
java.lang.Throwable: Child Error
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.getDetails(JvmManager.java:402)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:387)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:192)
    at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:125)
    at org.apache.hadoop.mapred.TaskRunner.launchJvmAndWait(TaskRunner.java:292)
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:251)
[jira] [Created] (MAPREDUCE-5261) TestRMContainerAllocator is exiting and failing the build
Jason Lowe created MAPREDUCE-5261:

Summary: TestRMContainerAllocator is exiting and failing the build
Key: MAPREDUCE-5261
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5261
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 2.0.5-beta
Reporter: Jason Lowe

Recent builds are failing because TestRMContainerAllocator is exiting rather than succeeding or failing.
[jira] [Commented] (MAPREDUCE-5261) TestRMContainerAllocator is exiting and failing the build
[ https://issues.apache.org/jira/browse/MAPREDUCE-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662046#comment-13662046 ]

Jason Lowe commented on MAPREDUCE-5261:

Output from the test:

{noformat}
2013-05-20 14:32:36,231 DEBUG [AsyncDispatcher event handler] security.BaseContainerTokenSecretManager (BaseContainerTokenSecretManager.java:createPassword(130)) - Creating password for container_1369060353300_0001_01_01 for user container_1369060353300_0001_01_01 (auth:SIMPLE) to be run on NM amNM:1234
2013-05-20 14:32:36,232 DEBUG [AsyncDispatcher event handler] security.ContainerTokenIdentifier (ContainerTokenIdentifier.java:write(98)) - Writing ContainerTokenIdentifier to RPC layer: org.apache.hadoop.yarn.security.ContainerTokenIdentifier@108f2ca6
2013-05-20 14:32:36,241 DEBUG [AsyncDispatcher event handler] security.ContainerTokenIdentifier (ContainerTokenIdentifier.java:write(98)) - Writing ContainerTokenIdentifier to RPC layer: org.apache.hadoop.yarn.security.ContainerTokenIdentifier@108f2ca6
2013-05-20 14:32:36,242 FATAL [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(137)) - Error in dispatcher thread
java.lang.IllegalArgumentException: java.net.UnknownHostException: amNM
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)
    at org.apache.hadoop.yarn.util.BuilderUtils.newContainerToken(BuilderUtils.java:281)
    at org.apache.hadoop.yarn.server.security.BaseContainerTokenSecretManager.createContainerToken(BaseContainerTokenSecretManager.java:202)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainer(FifoScheduler.java:555)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignOffSwitchContainers(FifoScheduler.java:519)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainersOnNode(FifoScheduler.java:447)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainers(FifoScheduler.java:376)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.nodeUpdate(FifoScheduler.java:615)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:644)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:92)
    at org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator$MyResourceManager$1.handle(TestRMContainerAllocator.java:450)
    at org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator$MyResourceManager$1.handle(TestRMContainerAllocator.java:447)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:130)
    at org.apache.hadoop.yarn.event.DrainDispatcher$1.run(DrainDispatcher.java:65)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.UnknownHostException: amNM
    ... 15 more
2013-05-20 14:32:36,244 INFO [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(140)) - Exiting, bbye..
{noformat}

TestRMContainerAllocator is exiting and failing the build

Key: MAPREDUCE-5261
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5261
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 2.0.5-beta
Reporter: Jason Lowe

Recent builds are failing because TestRMContainerAllocator is exiting rather than succeeding or failing.
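Editor's sketch: the trace above shows an IllegalArgumentException wrapping an UnknownHostException for the made-up test hostname amNM. The snippet below reproduces that shape of failure with JDK classes only, without touching DNS. `serviceName` is a hypothetical helper written for illustration; it is not Hadoop's SecurityUtil.buildTokenService, whose internals are not shown in this thread.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

// Illustration of the failure mode only; NOT Hadoop's SecurityUtil.
final class UnresolvedHostSketch {
    // Builds a "host-address:port" string and refuses unresolved addresses,
    // wrapping the cause the same way the logged exception does.
    static String serviceName(InetSocketAddress addr) {
        if (addr.isUnresolved()) {
            throw new IllegalArgumentException(
                    new UnknownHostException(addr.getHostName()));
        }
        return addr.getAddress().getHostAddress() + ":" + addr.getPort();
    }

    public static void main(String[] args) {
        // createUnresolved never performs a lookup, so this is deterministic.
        InetSocketAddress fake = InetSocketAddress.createUnresolved("amNM", 1234);
        try {
            serviceName(fake);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        // A loopback address is always resolved.
        InetSocketAddress local =
                new InetSocketAddress(InetAddress.getLoopbackAddress(), 1234);
        System.out.println("accepted: " + serviceName(local));
    }
}
```

A test that feeds a fictitious node name into code paths that resolve addresses will hit this kind of exception unless the name resolves or the lookup is bypassed.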
[jira] [Commented] (MAPREDUCE-5176) Preemptable annotations (to support preemption in MR)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662151#comment-13662151 ]

Alejandro Abdelnur commented on MAPREDUCE-5176:

I like the idea of annotations to drive checkpointing. As preemption is a YARN feature, wouldn't it make sense to have @Preemptable as a YARN annotation and have utility classes that help an AM implement such logic? By doing this we could use it in the AM itself to implement AM failover recovery. Thoughts?

Preemptable annotations (to support preemption in MR)

Key: MAPREDUCE-5176
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5176
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv2
Reporter: Carlo Curino
Assignee: Carlo Curino
Attachments: MAPREDUCE-5176.1.patch, MAPREDUCE-5176.patch

Proposing a patch that introduces a new annotation, @Preemptable, that conveys to the framework a property of user-supplied classes (e.g., Reducer, OutputCommitter). The intended semantics is that a tagged class is safe to preempt between invocations. (This is similar in spirit to the Output Contracts of [Nephele/PACT | https://stratosphere.eu/sites/default/files/papers/ComparingMapReduceAndPACTs_11.pdf].)
[jira] [Commented] (MAPREDUCE-5176) Preemptable annotations (to support preemption in MR)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662166#comment-13662166 ]

Carlo Curino commented on MAPREDUCE-5176:

I like the idea, Alejandro. Especially as more and more AMs come into existence, providing some of this in the common layer of YARN might be interesting and might help factor out some of the basic mechanisms. We do something in this direction for the checkpoint service itself: we posted it in MAPREDUCE-5197, but internally we are experimenting with reusing it for other AMs, so maybe you are right that this belongs in some common part of the codebase. I would like to gather more opinions on this and try to build consensus before starting to shuffle these patches around. Thoughts on the idea of having some place in YARN (or whatever other common place) to put these annotations and maybe the basic common checkpoint service?

Preemptable annotations (to support preemption in MR)

Key: MAPREDUCE-5176
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5176
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv2
Reporter: Carlo Curino
Assignee: Carlo Curino
Attachments: MAPREDUCE-5176.1.patch, MAPREDUCE-5176.patch

Proposing a patch that introduces a new annotation, @Preemptable, that conveys to the framework a property of user-supplied classes (e.g., Reducer, OutputCommitter). The intended semantics is that a tagged class is safe to be preempted between invocations. (This is similar in spirit to the Output Contracts of [Nephele/PACT | https://stratosphere.eu/sites/default/files/papers/ComparingMapReduceAndPACTs_11.pdf].)
[jira] [Assigned] (MAPREDUCE-5261) TestRMContainerAllocator is exiting and failing the build
[ https://issues.apache.org/jira/browse/MAPREDUCE-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli reassigned MAPREDUCE-5261: -- Assignee: Omkar Vinit Joshi Sigh, I wish Jenkins ran these tests as part of commit. If not that, it should've run all tests and reported all failures in the nightly builds. Omkar, can you see if there is a common JIRA for this and fix it? Also, can you run *all* MR tests? Tx. TestRMContainerAllocator is exiting and failing the build - Key: MAPREDUCE-5261 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5261 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.5-beta Reporter: Jason Lowe Assignee: Omkar Vinit Joshi Recent builds are failing because TestRMContainerAllocator is exiting rather than succeeding or failing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5176) Preemptable annotations (to support preemption in MR)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662233#comment-13662233 ]

Karthik Kambatla commented on MAPREDUCE-5176:

Neat idea to use annotations to capture the operator behavior. +1 to moving it to YARN (maybe yarn-common). Also, while at it, I was wondering whether it would make sense to add a @Stateless annotation.

Preemptable annotations (to support preemption in MR)

Key: MAPREDUCE-5176
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5176
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv2
Reporter: Carlo Curino
Assignee: Carlo Curino
Attachments: MAPREDUCE-5176.1.patch, MAPREDUCE-5176.patch

Proposing a patch that introduces a new annotation, @Preemptable, that conveys to the framework a property of user-supplied classes (e.g., Reducer, OutputCommitter). The intended semantics is that a tagged class is safe to be preempted between invocations. (This is similar in spirit to the Output Contracts of [Nephele/PACT | https://stratosphere.eu/sites/default/files/papers/ComparingMapReduceAndPACTs_11.pdf].)
[jira] [Commented] (MAPREDUCE-5095) TestShuffleExceptionCount#testCheckException fails occasionally with JDK7
[ https://issues.apache.org/jira/browse/MAPREDUCE-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662260#comment-13662260 ]

Arpit Agarwal commented on MAPREDUCE-5095:

Hi Hitesh, thanks for reviewing! abortCalled cannot be non-static, since it is referenced from a static nested class.

TestShuffleExceptionCount#testCheckException fails occasionally with JDK7

Key: MAPREDUCE-5095
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5095
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 1.1.2
Environment: Open JDK7
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
Fix For: 1.3.0
Attachments: MAPREDUCE-5095.patch
Original Estimate: 1h
Time Spent: 1h
Remaining Estimate: 0h

The test fails due to a test-order dependency that can be violated when running with JDK 7.
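The language rule behind the reply above can be shown in a few lines. This is a generic sketch, not the code from MAPREDUCE-5095.patch: only the field name `abortCalled` comes from the thread, while `AbortHandler` and `reset` are hypothetical. A static nested class has no enclosing instance, so it can only reach static members of the outer class directly. The trade-off is that a static flag outlives a single test method, so (as an assumption about how one would typically handle this) it has to be reset explicitly to keep tests independent of execution order.

```java
public class StaticNestedSketch {

  // Must be static: AbortHandler below is a static nested class and has no
  // enclosing StaticNestedSketch instance whose field it could reach.
  public static boolean abortCalled = false;

  /** Hypothetical stand-in for the nested helper that records the abort. */
  public static class AbortHandler {
    public void abort() {
      abortCalled = true; // compiles only because the field is static
    }
  }

  /** Hypothetical per-test reset, so one test cannot leak state into the next. */
  public static void reset() {
    abortCalled = false;
  }

  public static void main(String[] args) {
    new AbortHandler().abort();
    System.out.println(abortCalled);
    reset();
    System.out.println(abortCalled);
  }
}
```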
[jira] [Commented] (MAPREDUCE-5176) Preemptable annotations (to support preemption in MR)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662280#comment-13662280 ]

Sandy Ryza commented on MAPREDUCE-5176:

I'm also +1 on the idea. For the case of stateful operators, it seems like a checkpoint method that gets called before preemption would be useful, right?

Preemptable annotations (to support preemption in MR)

Key: MAPREDUCE-5176
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5176
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv2
Reporter: Carlo Curino
Assignee: Carlo Curino
Attachments: MAPREDUCE-5176.1.patch, MAPREDUCE-5176.patch

Proposing a patch that introduces a new annotation, @Preemptable, that conveys to the framework a property of user-supplied classes (e.g., Reducer, OutputCommitter). The intended semantics is that a tagged class is safe to be preempted between invocations. (This is similar in spirit to the Output Contracts of [Nephele/PACT | https://stratosphere.eu/sites/default/files/papers/ComparingMapReduceAndPACTs_11.pdf].)
[jira] [Commented] (MAPREDUCE-5261) TestRMContainerAllocator is exiting and failing the build
[ https://issues.apache.org/jira/browse/MAPREDUCE-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662288#comment-13662288 ]

Omkar Vinit Joshi commented on MAPREDUCE-5261:

Attaching a patch for this. I have created a Hadoop ticket, HADOOP-9580, to fix Jenkins and make sure it runs all the tests before commit.

TestRMContainerAllocator is exiting and failing the build

Key: MAPREDUCE-5261
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5261
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 2.0.5-beta
Reporter: Jason Lowe
Assignee: Omkar Vinit Joshi
Attachments: MAPREDUCE-5261.patch

Recent builds are failing because TestRMContainerAllocator is exiting rather than succeeding or failing.
[jira] [Updated] (MAPREDUCE-5261) TestRMContainerAllocator is exiting and failing the build
[ https://issues.apache.org/jira/browse/MAPREDUCE-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omkar Vinit Joshi updated MAPREDUCE-5261: - Attachment: MAPREDUCE-5261.patch TestRMContainerAllocator is exiting and failing the build - Key: MAPREDUCE-5261 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5261 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.5-beta Reporter: Jason Lowe Assignee: Omkar Vinit Joshi Attachments: MAPREDUCE-5261.patch Recent builds are failing because TestRMContainerAllocator is exiting rather than succeeding or failing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662313#comment-13662313 ]

Hitesh Shah commented on MAPREDUCE-5191:

+1. Committing shortly.

TestQueue#testQueue fails with timeout on Windows

Key: MAPREDUCE-5191
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.3.patch, MAPREDUCE-5191.patch

The test always times out on my machine after 5 seconds with the stack below:
{code}
testQueue(org.apache.hadoop.mapred.TestQueue) Time elapsed: 5009 sec ERROR!
java.lang.Exception: test timed out after 5000 milliseconds
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:485)
at sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
at sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
at sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
at sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
at sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
at java.security.SecureRandom.next(SecureRandom.java:455)
at java.util.Random.nextLong(Random.java:284)
at java.io.File.generateFile(File.java:1682)
at java.io.File.createTempFile(File.java:1791)
at java.io.File.createTempFile(File.java:1828)
at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
{code}
[jira] [Updated] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hitesh Shah updated MAPREDUCE-5191: --- Resolution: Fixed Fix Version/s: 3.0.0 Release Note: Thanks Ivan. Committed to trunk. Status: Resolved (was: Patch Available) TestQueue#testQueue fails with timeout on Windows - Key: MAPREDUCE-5191 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 3.0.0 Reporter: Ivan Mitic Assignee: Ivan Mitic Fix For: 3.0.0 Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.3.patch, MAPREDUCE-5191.patch Test times out on my machine after 5 seconds always on the below stack: {code} testQueue(org.apache.hadoop.mapred.TestQueue) Time elapsed: 5009 sec ERROR! java.lang.Exception: test timed out after 5000 milliseconds at java.lang.Object.wait(Native Method) at java.lang.Object.wait(Object.java:485) at sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330) at sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319) at sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117) at sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114) at sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171) at java.security.SecureRandom.nextBytes(SecureRandom.java:433) at java.security.SecureRandom.next(SecureRandom.java:455) at java.util.Random.nextLong(Random.java:284) at java.io.File.generateFile(File.java:1682) at java.io.File.createTempFile(File.java:1791) at java.io.File.createTempFile(File.java:1828) at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221) at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53) {code} -- This message is automatically generated by JIRA. 
[jira] [Commented] (MAPREDUCE-5095) TestShuffleExceptionCount#testCheckException fails occasionally with JDK7
[ https://issues.apache.org/jira/browse/MAPREDUCE-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662316#comment-13662316 ]

Hitesh Shah commented on MAPREDUCE-5095:

[~arpitagarwal] I should have reviewed the whole patch in context. Thanks for the clarification. +1. Will commit shortly.

TestShuffleExceptionCount#testCheckException fails occasionally with JDK7

Key: MAPREDUCE-5095
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5095
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 1.1.2
Environment: Open JDK7
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
Fix For: 1.3.0
Attachments: MAPREDUCE-5095.patch
Original Estimate: 1h
Time Spent: 1h
Remaining Estimate: 0h

The test fails due to a test-order dependency that can be violated when running with JDK 7.
[jira] [Resolved] (MAPREDUCE-5095) TestShuffleExceptionCount#testCheckException fails occasionally with JDK7
[ https://issues.apache.org/jira/browse/MAPREDUCE-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hitesh Shah resolved MAPREDUCE-5095.

Resolution: Fixed
Release Note: Thanks Arpit. Committed to branch-1.

TestShuffleExceptionCount#testCheckException fails occasionally with JDK7

Key: MAPREDUCE-5095
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5095
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 1.1.2
Environment: Open JDK7
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
Fix For: 1.3.0
Attachments: MAPREDUCE-5095.patch
Original Estimate: 1h
Time Spent: 1h
Remaining Estimate: 0h

The test fails due to a test-order dependency that can be violated when running with JDK 7.
[jira] [Updated] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hitesh Shah updated MAPREDUCE-5191: --- Release Note: (was: Thanks Ivan. Committed to trunk.) TestQueue#testQueue fails with timeout on Windows - Key: MAPREDUCE-5191 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 3.0.0 Reporter: Ivan Mitic Assignee: Ivan Mitic Fix For: 3.0.0 Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.3.patch, MAPREDUCE-5191.patch Test times out on my machine after 5 seconds always on the below stack: {code} testQueue(org.apache.hadoop.mapred.TestQueue) Time elapsed: 5009 sec ERROR! java.lang.Exception: test timed out after 5000 milliseconds at java.lang.Object.wait(Native Method) at java.lang.Object.wait(Object.java:485) at sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330) at sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319) at sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117) at sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114) at sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171) at java.security.SecureRandom.nextBytes(SecureRandom.java:433) at java.security.SecureRandom.next(SecureRandom.java:455) at java.util.Random.nextLong(Random.java:284) at java.io.File.generateFile(File.java:1682) at java.io.File.createTempFile(File.java:1791) at java.io.File.createTempFile(File.java:1828) at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221) at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53) {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5095) TestShuffleExceptionCount#testCheckException fails occasionally with JDK7
[ https://issues.apache.org/jira/browse/MAPREDUCE-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hitesh Shah updated MAPREDUCE-5095:

Release Note: (was: Thanks Arpit. Committed to branch-1.)

TestShuffleExceptionCount#testCheckException fails occasionally with JDK7

Key: MAPREDUCE-5095
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5095
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 1.1.2
Environment: Open JDK7
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
Fix For: 1.3.0
Attachments: MAPREDUCE-5095.patch
Original Estimate: 1h
Time Spent: 1h
Remaining Estimate: 0h

The test fails due to a test-order dependency that can be violated when running with JDK 7.
[jira] [Commented] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662320#comment-13662320 ]

Hitesh Shah commented on MAPREDUCE-5191:

Thanks Ivan. Committed to trunk.

TestQueue#testQueue fails with timeout on Windows

Key: MAPREDUCE-5191
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
Fix For: 3.0.0
Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.3.patch, MAPREDUCE-5191.patch

The test always times out on my machine after 5 seconds with the stack below:
{code}
testQueue(org.apache.hadoop.mapred.TestQueue) Time elapsed: 5009 sec ERROR!
java.lang.Exception: test timed out after 5000 milliseconds
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:485)
at sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
at sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
at sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
at sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
at sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
at java.security.SecureRandom.next(SecureRandom.java:455)
at java.util.Random.nextLong(Random.java:284)
at java.io.File.generateFile(File.java:1682)
at java.io.File.createTempFile(File.java:1791)
at java.io.File.createTempFile(File.java:1828)
at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
{code}
[jira] [Commented] (MAPREDUCE-5095) TestShuffleExceptionCount#testCheckException fails occasionally with JDK7
[ https://issues.apache.org/jira/browse/MAPREDUCE-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662321#comment-13662321 ]

Hitesh Shah commented on MAPREDUCE-5095:

Thanks Arpit. Committed to branch-1.

TestShuffleExceptionCount#testCheckException fails occasionally with JDK7

Key: MAPREDUCE-5095
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5095
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 1.1.2
Environment: Open JDK7
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
Fix For: 1.3.0
Attachments: MAPREDUCE-5095.patch
Original Estimate: 1h
Time Spent: 1h
Remaining Estimate: 0h

The test fails due to a test-order dependency that can be violated when running with JDK 7.
[jira] [Commented] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662323#comment-13662323 ]

Hudson commented on MAPREDUCE-5191:

Integrated in Hadoop-trunk-Commit #3769 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/3769/])
MAPREDUCE-5191. TestQueue#testQueue fails with timeout on Windows. (Contributed by Ivan Mitic) (Revision 1484575)

Result = SUCCESS
hitesh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1484575
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestQueue.java

TestQueue#testQueue fails with timeout on Windows

Key: MAPREDUCE-5191
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
Fix For: 3.0.0
Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.3.patch, MAPREDUCE-5191.patch

The test always times out on my machine after 5 seconds with the stack below:
{code}
testQueue(org.apache.hadoop.mapred.TestQueue) Time elapsed: 5009 sec ERROR!
java.lang.Exception: test timed out after 5000 milliseconds
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:485)
at sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
at sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
at sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
at sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
at sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
at java.security.SecureRandom.next(SecureRandom.java:455)
at java.util.Random.nextLong(Random.java:284)
at java.io.File.generateFile(File.java:1682)
at java.io.File.createTempFile(File.java:1791)
at java.io.File.createTempFile(File.java:1828)
at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
{code}
[jira] [Commented] (MAPREDUCE-5176) Preemptable annotations (to support preemption in MR)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662330#comment-13662330 ]

Carlo Curino commented on MAPREDUCE-5176:

Karthik, I think there are several other annotations we might want to think about. @Stateless is one; another is @PreserveKeyOrder, which can express, for example, that maps are not messing with the sort order, and would allow a smart runtime to pipeline maps and reduces (and skip shuffling) when the input is sorted. I think this could be a powerful tool to expose opportunities for runtime optimization that are not possible in the general case (unless you know something about the code semantics). If no one disagrees, I like the idea of having this in a common place (maybe yarn-common, as Karthik suggested?).

BTW, I think we are in a great spot to carry this conversation forward, since we have one very specific example of these annotations: @Preemptable, for which we have the entire end-to-end usage scenario (all the preemption-in-MapReduce work tracked in MAPREDUCE-5189, MAPREDUCE-5192, MAPREDUCE-5194, MAPREDUCE-5196, MAPREDUCE-5197 and a few upcoming ones), and plenty more ideas coming up from people. Ideally I would like to move forward with the @Preemptable one and see it through (so we can evaluate it and learn from it), and in parallel we can initiate a broader (and rightfully longer) conversation around annotations for runtime optimization.

Sandy, what we envisioned for MapReduce is that an advanced user who has a stateful UDF can mark it as @Preemptable and override the default save-to-checkpoint logic to include the portion of state he/she cares about.

Preemptable annotations (to support preemption in MR)

Key: MAPREDUCE-5176
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5176
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv2
Reporter: Carlo Curino
Assignee: Carlo Curino
Attachments: MAPREDUCE-5176.1.patch, MAPREDUCE-5176.patch

Proposing a patch that introduces a new annotation, @Preemptable, that conveys to the framework a property of user-supplied classes (e.g., Reducer, OutputCommitter). The intended semantics is that a tagged class is safe to be preempted between invocations. (This is similar in spirit to the Output Contracts of [Nephele/PACT | https://stratosphere.eu/sites/default/files/papers/ComparingMapReduceAndPACTs_11.pdf].)
[jira] [Commented] (MAPREDUCE-5176) Preemptable annotations (to support preemption in MR)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662339#comment-13662339 ]

Karthik Kambatla commented on MAPREDUCE-5176:

bq. what we envisioned for MapReduce is that an advanced user who has a stateful UDF can mark it as @Preemptable and override the default save-to-checkpoint logic to include the portion of state he/she cares about.

While that is perfectly fine for a first-draft solution, a user might prefer to get feedback when they annotate @Preemptable without fully understanding the consequences. Assuming @Stateless is also added, the preemption code can warn the user that they have annotated @Preemptable without @Stateless and that they are expected to implement/override the checkpoint logic. Thoughts?

Preemptable annotations (to support preemption in MR)

Key: MAPREDUCE-5176
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5176
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv2
Reporter: Carlo Curino
Assignee: Carlo Curino
Attachments: MAPREDUCE-5176.1.patch, MAPREDUCE-5176.patch

Proposing a patch that introduces a new annotation, @Preemptable, that conveys to the framework a property of user-supplied classes (e.g., Reducer, OutputCommitter). The intended semantics is that a tagged class is safe to be preempted between invocations. (This is similar in spirit to the Output Contracts of [Nephele/PACT | https://stratosphere.eu/sites/default/files/papers/ComparingMapReduceAndPACTs_11.pdf].)
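The warning suggested in the comment above could look roughly like the following. This is a sketch only: neither annotation is defined here as it appears in any patch, and the `lint` helper, its message text, and the sample classes are hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Optional;

public class PreemptionLintSketch {

  /** Hypothetical marker: safe to preempt between invocations. */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  public @interface Preemptable {}

  /** Hypothetical marker: the class keeps no state at all. */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  public @interface Stateless {}

  @Preemptable
  @Stateless
  public static class PureMapper {}

  @Preemptable
  public static class CachingReducer {}

  /**
   * Returns a warning when a class is tagged @Preemptable but not @Stateless,
   * reminding the author to review or override the checkpoint logic.
   */
  public static Optional<String> lint(Class<?> userClass) {
    boolean preemptable = userClass.isAnnotationPresent(Preemptable.class);
    boolean stateless = userClass.isAnnotationPresent(Stateless.class);
    if (preemptable && !stateless) {
      return Optional.of(userClass.getSimpleName()
          + " is @Preemptable but not @Stateless: override the checkpoint logic"
          + " if any state must survive preemption");
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    lint(PureMapper.class).ifPresent(System.out::println);
    lint(CachingReducer.class).ifPresent(System.out::println);
  }
}
```

Returning the message instead of logging it leaves the caller free to decide whether to warn, fail the job submission, or ignore it.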
[jira] [Commented] (MAPREDUCE-5176) Preemptable annotations (to support preemption in MR)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662348#comment-13662348 ]

Carlo Curino commented on MAPREDUCE-5176:

Interesting point, although @Stateless is a stricter condition than we need. It is OK to maintain state as long as it is not semantically required to persist across key boundaries. Two common cases are: 1) you have state that you reset or ignore at every new key group, e.g., an aggregate grouped by key, and 2) you maintain state as an optimization (memoization), but it is not required for correctness. So while @Stateless would guarantee that it is safe to preempt using default checkpointing, it is tighter than we need. In general, I would expect a user who tags their code to understand the @Preemptable semantics, which is: if your code does not depend on state being preserved across key boundaries, you are good to go; otherwise you should carefully override these methods.

Preemptable annotations (to support preemption in MR)

Key: MAPREDUCE-5176
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5176
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv2
Reporter: Carlo Curino
Assignee: Carlo Curino
Attachments: MAPREDUCE-5176.1.patch, MAPREDUCE-5176.patch

Proposing a patch that introduces a new annotation, @Preemptable, that conveys to the framework a property of user-supplied classes (e.g., Reducer, OutputCommitter). The intended semantics is that a tagged class is safe to be preempted between invocations. (This is similar in spirit to the Output Contracts of [Nephele/PACT | https://stratosphere.eu/sites/default/files/papers/ComparingMapReduceAndPACTs_11.pdf].)
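The first case described above (state that is reset at every new key group) can be illustrated with a small simulation. Nothing here uses the Hadoop API: `SumPerKey`, `run`, and the way preemption is simulated (swapping in a fresh instance between key groups, so all in-memory state is lost) are assumptions chosen for the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class KeyBoundarySketch {

  /** Hypothetical reducer-like class whose state only matters within one key group. */
  public static class SumPerKey {
    private long runningSum; // not stateless, but reset at every key boundary

    public String reduce(String key, List<Integer> values) {
      runningSum = 0;
      for (int v : values) {
        runningSum += v;
      }
      return key + "=" + runningSum;
    }
  }

  /**
   * Processes the key groups in order. If preemptAfter is non-negative, the
   * reducer is replaced by a fresh instance after that many groups, which
   * simulates a preemption between invocations.
   */
  public static List<String> run(Map<String, List<Integer>> groups, int preemptAfter) {
    List<String> out = new ArrayList<>();
    SumPerKey reducer = new SumPerKey();
    int done = 0;
    for (Map.Entry<String, List<Integer>> e : groups.entrySet()) {
      if (done == preemptAfter) {
        reducer = new SumPerKey(); // in-memory state is gone after "preemption"
      }
      out.add(reducer.reduce(e.getKey(), e.getValue()));
      done++;
    }
    return out;
  }

  public static Map<String, List<Integer>> sampleGroups() {
    Map<String, List<Integer>> groups = new LinkedHashMap<>();
    groups.put("a", Arrays.asList(1, 2));
    groups.put("b", Arrays.asList(3, 4));
    return groups;
  }

  public static void main(String[] args) {
    System.out.println(run(sampleGroups(), -1)); // uninterrupted run
    System.out.println(run(sampleGroups(), 1));  // "preempted" after the first key group
  }
}
```

Both runs produce the same output, which is the property the comment relies on: the class holds state, yet losing it at a key boundary does not change the result.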
[jira] [Updated] (MAPREDUCE-5240) inside of FileOutputCommitter the initialized Credentials cache appears to be empty
[ https://issues.apache.org/jira/browse/MAPREDUCE-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Boudnik updated MAPREDUCE-5240:

Resolution: Fixed
Status: Resolved (was: Patch Available)

I have committed the modified patch to 2.0.4.1. Thanks Roman.

inside of FileOutputCommitter the initialized Credentials cache appears to be empty

Key: MAPREDUCE-5240
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5240
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 2.0.4-alpha
Reporter: Roman Shaposhnik
Assignee: Vinod Kumar Vavilapalli
Priority: Blocker
Labels: 2.0.4.1
Fix For: 2.0.5-beta, 2.0.4.1-alpha
Attachments: LostCreds.java, MAPREDUCE-5240-20130512.txt, MAPREDUCE-5240-20130513.txt, MAPREDUCE-5240.2.0.4.rvs.patch.txt

I am attaching a modified wordcount job that clearly demonstrates the problem we've encountered in running Sqoop2 on YARN (BIGTOP-949). Here's what running it produces:
{noformat}
$ hadoop fs -mkdir in
$ hadoop fs -put /etc/passwd in
$ hadoop jar ./bug.jar org.myorg.LostCreds
13/05/12 03:13:46 WARN mapred.JobConf: The variable mapred.child.ulimit is no longer used.
numberOfSecretKeys: 1 numberOfTokens: 0
.. .. ..
13/05/12 03:05:35 INFO mapreduce.Job: Job job_1368318686284_0013 failed with state FAILED due to: Job commit failed: java.io.IOException: numberOfSecretKeys: 0 numberOfTokens: 0
at org.myorg.LostCreds$DestroyerFileOutputCommitter.commitJob(LostCreds.java:43)
at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:249)
at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:212)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
{noformat}
As you can see, even though we've clearly initialized the creds via:
{noformat}
job.getCredentials().addSecretKey(new Text(mykey), mysecret.getBytes());
{noformat}
the secret key doesn't seem to appear later in the job. This is a pretty critical issue for Sqoop 2, since it appears to be DOA for YARN in Hadoop 2.0.4-alpha.
[jira] [Commented] (MAPREDUCE-5199) AppTokens file can/should be removed
[ https://issues.apache.org/jira/browse/MAPREDUCE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662357#comment-13662357 ] Vinod Kumar Vavilapalli commented on MAPREDUCE-5199: I just finished another test with debug-delete-delays and wait times: none of the tasks' files (appTokens, containerTokens, appId.tokens) contain any ApplicationTokens for tasks. They are only present in the AM's files. So, back to square one: no idea why Oozie workflows would fail. That said, this patch can go in anyway, and if it somehow fixes your issue, great. There are a couple of other solutions to avoid tasks using the wrong token for the AM-RM connection, like fixing the TokenSelector, but we can pursue that separately to unblock you. Will review the patch in a little while. AppTokens file can/should be removed Key: MAPREDUCE-5199 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5199 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: security Affects Versions: 3.0.0, 2.0.5-beta Reporter: Vinod Kumar Vavilapalli Assignee: Daryn Sharp Priority: Blocker Attachments: MAPREDUCE-5199.patch All the required tokens are propagated to AMs and containers via startContainer(), so there is no need to explicitly create the app-token file that we have today.
[jira] [Commented] (MAPREDUCE-5199) AppTokens file can/should be removed
[ https://issues.apache.org/jira/browse/MAPREDUCE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662360#comment-13662360 ] Siddharth Seth commented on MAPREDUCE-5199: bq. The issue stems from conf.getCredentials().addAll(credentials). These tokens aren't used to create the LaunchContext for the child. I still don't see how the App Token is leaking into the appTokens file. The changes in the patch are required irrespective of this issue. It'd be very useful to understand what is causing the appToken clobber though. A couple of comments on the patch itself. - Should downloadTokensAndSetupUGI be called as part of initAndStartAppMaster itself, so that the jobConf credentials are populated before the init? - Rename downloadTokensAndSetupUGI to something like setupJobTokensAndUGI? AppTokens file can/should be removed Key: MAPREDUCE-5199 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5199 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: security Affects Versions: 3.0.0, 2.0.5-beta Reporter: Vinod Kumar Vavilapalli Assignee: Daryn Sharp Priority: Blocker Attachments: MAPREDUCE-5199.patch All the required tokens are propagated to AMs and containers via startContainer(), no need for explicitly creating the app-token file that we have today.
[jira] [Commented] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662363#comment-13662363 ] Arun C Murthy commented on MAPREDUCE-3859: [~sergeant] I've just committed this to branch-1 and branch-1.2, so we'll pick it up for hadoop-1.2.1. I've also helped add a test case and added this to trunk/branch-2. Thanks! CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs Key: MAPREDUCE-3859 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3859 Project: Hadoop Map/Reduce Issue Type: Bug Components: capacity-sched Affects Versions: 1.0.0 Environment: CDH3u1 Reporter: Sergey Tryuber Assignee: Sergey Tryuber Attachments: MAPREDUCE-3859_MR1_fix_and_test.patch.txt, test-to-fail.patch.txt Imagine, we have a queue A with capacity 10 slots and 20 as extra-capacity, jobs which use 3 map slots will never consume more than 9 slots, regardless how many free slots on a cluster.
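The arithmetic behind the "never more than 9 slots" symptom can be sketched as follows. This is only an illustration under stated assumptions, not the scheduler's code: the class and method names are hypothetical, and it assumes the faulty check caps usage at the queue's configured capacity (10) instead of an elastic limit, taken here as 10 + 20 = 30 slots.

```java
/** Hypothetical slot arithmetic for the queue described in the issue. */
final class QueueSlotMath {

    /**
     * Highest usage a queue can reach when every task needs slotsPerTask
     * slots and a task is admitted only if usage stays within limit.
     */
    static int maxUsableSlots(int limit, int slotsPerTask) {
        int used = 0;
        while (used + slotsPerTask <= limit) {
            used += slotsPerTask;
        }
        return used;
    }

    public static void main(String[] args) {
        int capacity = 10;      // configured queue capacity, from the issue
        int extra = 20;         // extra capacity, read here as additional slots (assumption)
        int slotsPerTask = 3;   // a high-memory task occupying 3 map slots

        // Capped at plain capacity: 3 + 3 + 3 = 9, and a fourth task would need 12.
        System.out.println(maxUsableSlots(capacity, slotsPerTask));
        // Capped at the assumed elastic limit: the queue can grow to 30.
        System.out.println(maxUsableSlots(capacity + extra, slotsPerTask));
    }
}
```

Under these assumptions the first call yields 9 and the second 30, which matches the reporter's observation that 3-slot jobs stall at 9 slots even when the cluster has free capacity.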
[jira] [Updated] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated MAPREDUCE-3859: - Target Version/s: (was: 1.3.0) Fix Version/s: 1.2.1 2.0.5-beta CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs Key: MAPREDUCE-3859 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3859 Project: Hadoop Map/Reduce Issue Type: Bug Components: capacity-sched Affects Versions: 1.0.0 Environment: CDH3u1 Reporter: Sergey Tryuber Assignee: Sergey Tryuber Fix For: 2.0.5-beta, 1.2.1 Attachments: MAPREDUCE-3859_MR1_fix_and_test.patch.txt, test-to-fail.patch.txt Imagine, we have a queue A with capacity 10 slots and 20 as extra-capacity, jobs which use 3 map slots will never consume more than 9 slots, regardless how many free slots on a cluster. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated MAPREDUCE-3859: - Environment: (was: CDH3u1) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs Key: MAPREDUCE-3859 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3859 Project: Hadoop Map/Reduce Issue Type: Bug Components: capacity-sched Affects Versions: 1.0.0 Reporter: Sergey Tryuber Assignee: Sergey Tryuber Fix For: 2.0.5-beta, 1.2.1 Attachments: MAPREDUCE-3859_MR1_fix_and_test.patch.txt, test-to-fail.patch.txt Imagine, we have a queue A with capacity 10 slots and 20 as extra-capacity, jobs which use 3 map slots will never consume more than 9 slots, regardless how many free slots on a cluster. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated MAPREDUCE-3859: - Status: Open (was: Patch Available) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs Key: MAPREDUCE-3859 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3859 Project: Hadoop Map/Reduce Issue Type: Bug Components: capacity-sched Affects Versions: 1.0.0 Reporter: Sergey Tryuber Assignee: Sergey Tryuber Fix For: 2.0.5-beta, 1.2.1 Attachments: MAPREDUCE-3859_MR1_fix_and_test.patch.txt, test-to-fail.patch.txt Imagine, we have a queue A with capacity 10 slots and 20 as extra-capacity, jobs which use 3 map slots will never consume more than 9 slots, regardless how many free slots on a cluster. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662398#comment-13662398 ] Xi Fang commented on MAPREDUCE-5224: Hi Ivan, I was addressing your fourth comment. I have one question. There are two methods:

  /**
   * Grab the local fs name
   */
  public synchronized String getFilesystemName() throws IOException {
    if (fs == null) {
      throw new IllegalStateException("FileSystem object not available yet");
    }
    return fs.getUri().toString();
  }

  /**
   * Get JobTracker's FileSystem. This is the filesystem for mapred.system.dir.
   */
  FileSystem getFileSystem() {
    return fs;
  }

I am a little bit confused. I think for getFileSystem() it is clear. We still return the systemDir's file system, so we should change this fs to systemDirFs, which I omitted in my previous patch. For getFilesystemName(), what does fs stand for in this context: the default fs or systemDir's file system? I guess it denotes the latter one. Right? Thanks
JobTracker should allow the system directory to be in non-default FS Key: MAPREDUCE-5224 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5224 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Reporter: Xi Fang Assignee: Xi Fang Priority: Minor Fix For: 1-win Attachments: MAPREDUCE-5224.2.patch, MAPREDUCE-5224.patch JobTracker today expects the system directory to be in the default file system:

  if (fs == null) {
    fs = mrOwner.doAs(new PrivilegedExceptionAction<FileSystem>() {
      public FileSystem run() throws IOException {
        return FileSystem.get(conf);
      }});
  }
  ...
  public String getSystemDir() {
    Path sysDir = new Path(conf.get("mapred.system.dir", "/tmp/hadoop/mapred/system"));
    return fs.makeQualified(sysDir).toString();
  }

In Cloud like Azure the default file system is set as ASV (Windows Azure Blob Storage), but we would still like the system directory to be in DFS. We should change JobTracker to allow that.
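The idea of letting the system directory name its own file system can be sketched with plain java.net.URI. This is a minimal illustration, not the attached patch: the class and method names, and the example URIs, are hypothetical. In Hadoop itself the analogous step would be to obtain the FileSystem from the path (for example via Path.getFileSystem(Configuration)) rather than from FileSystem.get(conf).

```java
import java.net.URI;

/** Hypothetical helper: qualify a system directory against the right file system. */
final class SystemDirResolver {

    /**
     * If sysDir already carries a scheme (e.g. hdfs://...), keep it as is;
     * otherwise qualify it against the default file system URI.
     */
    static URI qualify(URI defaultFs, String sysDir) {
        URI dir = URI.create(sysDir);
        if (dir.getScheme() != null) {
            return dir;                 // explicit, possibly non-default, file system
        }
        return defaultFs.resolve(dir);  // fall back to the default file system
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("asv://store");  // assumed default FS URI
        // A scheme-less setting lands on the default file system.
        System.out.println(qualify(defaultFs, "/tmp/hadoop/mapred/system"));
        // A fully qualified setting stays on the file system it names.
        System.out.println(qualify(defaultFs, "hdfs://namenode:8020/tmp/hadoop/mapred/system"));
    }
}
```

The point of the sketch is only that the decision is driven by the configured path, so a cluster whose default file system is ASV can still keep mapred.system.dir on DFS.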
[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662400#comment-13662400 ] Xi Fang commented on MAPREDUCE-5224: Sorry for the format! The system changed my text to something else because of the special symbols. JobTracker should allow the system directory to be in non-default FS Key: MAPREDUCE-5224 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5224 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Reporter: Xi Fang Assignee: Xi Fang Priority: Minor Fix For: 1-win Attachments: MAPREDUCE-5224.2.patch, MAPREDUCE-5224.patch JobTracker today expects the system directory to be in the default file system:

  if (fs == null) {
    fs = mrOwner.doAs(new PrivilegedExceptionAction<FileSystem>() {
      public FileSystem run() throws IOException {
        return FileSystem.get(conf);
      }});
  }
  ...
  public String getSystemDir() {
    Path sysDir = new Path(conf.get("mapred.system.dir", "/tmp/hadoop/mapred/system"));
    return fs.makeQualified(sysDir).toString();
  }

In Cloud like Azure the default file system is set as ASV (Windows Azure Blob Storage), but we would still like the system directory to be in DFS. We should change JobTracker to allow that.
[jira] [Commented] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662407#comment-13662407 ] Ivan Mitic commented on MAPREDUCE-5191: Thanks Hitesh! TestQueue#testQueue fails with timeout on Windows Key: MAPREDUCE-5191 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 3.0.0 Reporter: Ivan Mitic Assignee: Ivan Mitic Fix For: 3.0.0 Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.3.patch, MAPREDUCE-5191.patch The test always times out on my machine after 5 seconds, with the stack below:
{code}
testQueue(org.apache.hadoop.mapred.TestQueue) Time elapsed: 5009 sec ERROR!
java.lang.Exception: test timed out after 5000 milliseconds
 at java.lang.Object.wait(Native Method)
 at java.lang.Object.wait(Object.java:485)
 at sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
 at sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
 at sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
 at sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
 at sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
 at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
 at java.security.SecureRandom.next(SecureRandom.java:455)
 at java.util.Random.nextLong(Random.java:284)
 at java.io.File.generateFile(File.java:1682)
 at java.io.File.createTempFile(File.java:1791)
 at java.io.File.createTempFile(File.java:1828)
 at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
 at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
{code}
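The stack above shows the test blocked inside File.createTempFile while SecureRandom gathers seed bytes. One way a test could sidestep that call is sketched below. This is an assumption for illustration only and not necessarily what the attached patches do; the class name, method name, and directory are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical test helper that creates unique files without File.createTempFile. */
final class TestFiles {
    private static final AtomicLong COUNTER = new AtomicLong();

    /** Creates a new, uniquely named empty file in dir using a timestamp and a counter. */
    static File newTestFile(File dir, String prefix, String suffix) {
        if (!dir.isDirectory() && !dir.mkdirs()) {
            throw new UncheckedIOException(new IOException("cannot create " + dir));
        }
        try {
            while (true) {
                String name = prefix + System.nanoTime() + "-" + COUNTER.incrementAndGet() + suffix;
                File candidate = new File(dir, name);
                // createNewFile is atomic: it returns false if the name is already taken.
                if (candidate.createNewFile()) {
                    return candidate;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File dir = new File(System.getProperty("java.io.tmpdir"), "testqueue-sketch");
        File f = newTestFile(dir, "queue-", ".xml");
        System.out.println(f.getAbsolutePath());
        f.delete();
    }
}
```

The trade-off is that the file names are predictable, which is usually acceptable for a unit test's scratch directory but not for security-sensitive temporary files.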
[jira] [Commented] (MAPREDUCE-5259) TestTaskLog fails on Windows because of path separators missmatch
[ https://issues.apache.org/jira/browse/MAPREDUCE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662487#comment-13662487 ] Chris Nauroth commented on MAPREDUCE-5259: +1 for the patch. I verified that the test passes on Mac and Windows. Thank you for your contribution, Ivan! TestTaskLog fails on Windows because of path separators missmatch Key: MAPREDUCE-5259 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5259 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 3.0.0 Reporter: Ivan Mitic Assignee: Ivan Mitic Attachments: MAPREDUCE-5259.patch Test failure:
{noformat}
Running org.apache.hadoop.mapred.TestTaskLog
Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.516 sec FAILURE!
testTaskLog(org.apache.hadoop.mapred.TestTaskLog) Time elapsed: 409 sec FAILURE!
junit.framework.AssertionFailedError: null
 at junit.framework.Assert.fail(Assert.java:47)
 at junit.framework.Assert.assertTrue(Assert.java:20)
 at junit.framework.Assert.assertTrue(Assert.java:27)
 at org.apache.hadoop.mapred.TestTaskLog.testTaskLog(TestTaskLog.java:54)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
 at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
 at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
 at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
 at org.junit.internal.runners.statements.FailOnTimeout$1.run(FailOnTimeout.java:28)
{noformat}
[jira] [Work started] (MAPREDUCE-5260) Job failed because of JvmManager running into inconsistent state
[ https://issues.apache.org/jira/browse/MAPREDUCE-5260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on MAPREDUCE-5260 started by zhaoyunjiong. Job failed because of JvmManager running into inconsistent state Key: MAPREDUCE-5260 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5260 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 1.1.2 Reporter: zhaoyunjiong Assignee: zhaoyunjiong Fix For: 1.1.3 Attachments: MAPREDUCE-5260-branch-1.1.patch In our cluster, jobs failed because task initialization randomly failed: JvmManager ran into an inconsistent state and TaskTracker failed to exit:
java.lang.Throwable: Child Error
 at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
Caused by: java.lang.NullPointerException
 at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.getDetails(JvmManager.java:402)
 at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:387)
 at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:192)
 at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:125)
 at org.apache.hadoop.mapred.TaskRunner.launchJvmAndWait(TaskRunner.java:292)
 at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:251)
---
java.lang.Throwable: Child Error
 at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
Caused by: java.lang.NullPointerException
 at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.getDetails(JvmManager.java:402)
 at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:387)
 at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:192)
 at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:125)
 at org.apache.hadoop.mapred.TaskRunner.launchJvmAndWait(TaskRunner.java:292)
 at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:251)
[jira] [Commented] (MAPREDUCE-5038) old API CombineFileInputFormat missing fixes that are in new API
[ https://issues.apache.org/jira/browse/MAPREDUCE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662578#comment-13662578 ] Arun C Murthy commented on MAPREDUCE-5038: [~sandyr] Do you know why we are getting the wrong URLs? old API CombineFileInputFormat missing fixes that are in new API Key: MAPREDUCE-5038 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5038 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1.1.1 Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 1.3.0 Attachments: MAPREDUCE-5038-1.patch, MAPREDUCE-5038.patch, MAPREDUCE-5038-revised-1.patch, MAPREDUCE-5038-revised-1.patch, MAPREDUCE-5038-revised.patch The following changes patched the CombineFileInputFormat in mapreduce, but neglected the one in mapred: MAPREDUCE-1597 enabled the CombineFileInputFormat to work on splittable files; MAPREDUCE-2021 solved returning duplicate hostnames in split locations; MAPREDUCE-1806 CombineFileInputFormat does not work with paths not on default FS. In trunk this is not an issue as the one in mapred extends the one in mapreduce.
[jira] [Commented] (MAPREDUCE-5038) old API CombineFileInputFormat missing fixes that are in new API
[ https://issues.apache.org/jira/browse/MAPREDUCE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662608#comment-13662608 ] Sandy Ryza commented on MAPREDUCE-5038: From taking a deep look at the CombineFileInputFormat code, as well as copying this code into Hive and running it to see what happens, there doesn't appear to be anything on the MapReduce that could be modifying the authority in the URL that's passed in. So I think the wrong URLs must be coming from Hive. old API CombineFileInputFormat missing fixes that are in new API Key: MAPREDUCE-5038 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5038 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1.1.1 Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 1.3.0 Attachments: MAPREDUCE-5038-1.patch, MAPREDUCE-5038.patch, MAPREDUCE-5038-revised-1.patch, MAPREDUCE-5038-revised-1.patch, MAPREDUCE-5038-revised.patch The following changes patched the CombineFileInputFormat in mapreduce, but neglected the one in mapred: MAPREDUCE-1597 enabled the CombineFileInputFormat to work on splittable files; MAPREDUCE-2021 solved returning duplicate hostnames in split locations; MAPREDUCE-1806 CombineFileInputFormat does not work with paths not on default FS. In trunk this is not an issue as the one in mapred extends the one in mapreduce.
[jira] [Commented] (MAPREDUCE-5038) old API CombineFileInputFormat missing fixes that are in new API
[ https://issues.apache.org/jira/browse/MAPREDUCE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662609#comment-13662609 ] Sandy Ryza commented on MAPREDUCE-5038: *on the MapReduce side old API CombineFileInputFormat missing fixes that are in new API Key: MAPREDUCE-5038 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5038 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1.1.1 Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 1.3.0 Attachments: MAPREDUCE-5038-1.patch, MAPREDUCE-5038.patch, MAPREDUCE-5038-revised-1.patch, MAPREDUCE-5038-revised-1.patch, MAPREDUCE-5038-revised.patch The following changes patched the CombineFileInputFormat in mapreduce, but neglected the one in mapred: MAPREDUCE-1597 enabled the CombineFileInputFormat to work on splittable files; MAPREDUCE-2021 solved returning duplicate hostnames in split locations; MAPREDUCE-1806 CombineFileInputFormat does not work with paths not on default FS. In trunk this is not an issue as the one in mapred extends the one in mapreduce.
[jira] [Commented] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662647#comment-13662647 ] Sergey Tryuber commented on MAPREDUCE-3859: Thanks, Arun CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs Key: MAPREDUCE-3859 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3859 Project: Hadoop Map/Reduce Issue Type: Bug Components: capacity-sched Affects Versions: 1.0.0 Reporter: Sergey Tryuber Assignee: Sergey Tryuber Fix For: 2.0.5-beta, 1.2.1 Attachments: MAPREDUCE-3859_MR1_fix_and_test.patch.txt, test-to-fail.patch.txt Imagine, we have a queue A with capacity 10 slots and 20 as extra-capacity, jobs which use 3 map slots will never consume more than 9 slots, regardless how many free slots on a cluster.
[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS
[ https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662675#comment-13662675 ] Ivan Mitic commented on MAPREDUCE-5224: Thanks Xi for addressing the comments! bq. For getFilesystemName(), what does fs stand for in this context, default fs or systemDir's file system. I guess it denotes the latter one. Right? Right, I also see it as a systemDir. JobTracker should allow the system directory to be in non-default FS Key: MAPREDUCE-5224 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5224 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Reporter: Xi Fang Assignee: Xi Fang Priority: Minor Fix For: 1-win Attachments: MAPREDUCE-5224.2.patch, MAPREDUCE-5224.patch JobTracker today expects the system directory to be in the default file system:

  if (fs == null) {
    fs = mrOwner.doAs(new PrivilegedExceptionAction<FileSystem>() {
      public FileSystem run() throws IOException {
        return FileSystem.get(conf);
      }});
  }
  ...
  public String getSystemDir() {
    Path sysDir = new Path(conf.get("mapred.system.dir", "/tmp/hadoop/mapred/system"));
    return fs.makeQualified(sysDir).toString();
  }

In Cloud like Azure the default file system is set as ASV (Windows Azure Blob Storage), but we would still like the system directory to be in DFS. We should change JobTracker to allow that.