[jira] [Updated] (MAPREDUCE-5260) Job failed because of JvmManager running into inconsistent state

2013-05-20 Thread zhaoyunjiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaoyunjiong updated MAPREDUCE-5260:


Attachment: MAPREDUCE-5260.patch

The root cause of JvmManager running into an inconsistent state is that the 
TaskTracker lacks user information:
2013-05-14 07:01:31,482 INFO org.apache.hadoop.mapred.TaskTracker: About to purge task: attempt_201305100625_20199_m_000431_0
2013-05-14 07:01:31,485 INFO org.apache.hadoop.mapred.TaskController: Reading task controller config from /etc/hadoop/taskcontroller.cfg
2013-05-14 07:01:31,485 INFO org.apache.hadoop.mapred.TaskController: User zhaoyunjiong not found
2013-05-14 07:01:31,485 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.io.IOException: Problem signalling task 30048 with TERM; exit = 255
    at org.apache.hadoop.mapred.LinuxTaskController.signalTask(LinuxTaskController.java:319)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType$JvmRunner.kill(JvmManager.java:555)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.killJvmRunner(JvmManager.java:317)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.killJvm(JvmManager.java:297)
    at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.taskKilled(JvmManager.java:289)
    at org.apache.hadoop.mapred.JvmManager.taskKilled(JvmManager.java:158)
    at org.apache.hadoop.mapred.TaskRunner.kill(TaskRunner.java:801)
    at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.kill(TaskTracker.java:3279)
    at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.jobHasFinished(TaskTracker.java:3251)
    at org.apache.hadoop.mapred.TaskTracker.purgeTask(TaskTracker.java:2286)
    at org.apache.hadoop.mapred.TaskTracker.markUnresponsiveTasks(TaskTracker.java:2185)
    at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1862)
    at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2646)
    at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3900)


This patch catches the IOException thrown by LinuxTaskController to prevent 
the inconsistent state. 
It also makes sure the TT shuts itself down if it does run into an inconsistent state.
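
The failure mode and the fix described above can be illustrated with a minimal, self-contained Java sketch. It is not the attached patch: JvmRegistrySketch, Signaller, and the two maps are hypothetical stand-ins for JvmManager's internal bookkeeping, and the real code paths differ. The sketch only shows why an IOException escaping from the signalling step can leave two related maps out of sync, and how catching it keeps them consistent.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;

// Hypothetical stand-in for JvmManager bookkeeping; not Hadoop code.
class JvmRegistrySketch {
  /** Hypothetical signalling hook; plays the role of the task controller. */
  interface Signaller {
    void signal(int pid) throws IOException;
  }

  private final Map<String, Integer> jvmToPid = new HashMap<>();
  private final Map<String, String> taskToJvm = new HashMap<>();
  private final Signaller signaller;

  JvmRegistrySketch(Signaller signaller) {
    this.signaller = signaller;
  }

  void register(String taskId, String jvmId, int pid) {
    taskToJvm.put(taskId, jvmId);
    jvmToPid.put(jvmId, pid);
  }

  /** Unsafe: if signal() throws, jvmToPid keeps a stale entry. */
  void killUnsafe(String taskId) throws IOException {
    String jvmId = taskToJvm.remove(taskId);
    if (jvmId == null) {
      return;
    }
    signaller.signal(jvmToPid.get(jvmId));
    jvmToPid.remove(jvmId); // skipped when signal() throws
  }

  /** Safer: the signalling failure is caught and cleanup always runs. */
  void killSafe(String taskId) {
    String jvmId = taskToJvm.remove(taskId);
    if (jvmId == null) {
      return;
    }
    try {
      signaller.signal(jvmToPid.get(jvmId));
    } catch (IOException e) {
      System.err.println("Could not signal " + jvmId + ": " + e.getMessage());
    } finally {
      jvmToPid.remove(jvmId);
    }
  }

  /** True when every tracked JVM is referenced by a task and vice versa. */
  boolean isConsistent() {
    return jvmToPid.keySet().equals(new HashSet<>(taskToJvm.values()));
  }

  public static void main(String[] args) {
    // A signaller that always fails, mimicking "exit = 255" in the log.
    Signaller failing = pid -> {
      throw new IOException("Problem signalling task " + pid);
    };

    JvmRegistrySketch unsafe = new JvmRegistrySketch(failing);
    unsafe.register("attempt_1", "jvm_1", 30048);
    try {
      unsafe.killUnsafe("attempt_1");
    } catch (IOException e) {
      // The exception escapes before the second map is cleaned up.
    }
    System.out.println("unsafe consistent: " + unsafe.isConsistent());

    JvmRegistrySketch safe = new JvmRegistrySketch(failing);
    safe.register("attempt_1", "jvm_1", 30048);
    safe.killSafe("attempt_1");
    System.out.println("safe consistent: " + safe.isConsistent());
  }
}
```

In this toy model the unsafe path leaves a JVM entry with no owning task, which is the kind of stale state that a later lookup could trip over.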

 Job failed because of JvmManager running into inconsistent state
 

 Key: MAPREDUCE-5260
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5260
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 1.1.2
Reporter: zhaoyunjiong
 Fix For: 1.1.3

 Attachments: MAPREDUCE-5260.patch


 In our cluster, jobs failed because task initialization failed at random: 
 JvmManager had run into an inconsistent state and the TaskTracker failed 
 to exit:
 java.lang.Throwable: Child Error
   at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
 Caused by: java.lang.NullPointerException
   at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.getDetails(JvmManager.java:402)
   at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:387)
   at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:192)
   at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:125)
   at org.apache.hadoop.mapred.TaskRunner.launchJvmAndWait(TaskRunner.java:292)
   at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:251)
 ---
 java.lang.Throwable: Child Error
   at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
 Caused by: java.lang.NullPointerException
   at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.getDetails(JvmManager.java:402)
   at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:387)
   at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:192)
   at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:125)
   at org.apache.hadoop.mapred.TaskRunner.launchJvmAndWait(TaskRunner.java:292)
   at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:251)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5199) AppTokens file can/should be removed

2013-05-20 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661816#comment-13661816
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-5199:


bq. The issue stems from conf.getCredentials().addAll(credentials). Conf is a 
JobConf, and credentials is obtained via the login UGI. These credentials 
include the app token so by propagating them into the jobConf, the tasks 
acquire the app token.
I guess you are referring to that call in MRAppMaster. That conf is *never* 
propagated to the tasks, as I said before. The conf that tasks see is the one 
written out by the client.

I still don't understand the problem.

Please share logs or stack traces or a test-case that fails.

I quickly wrote up a patch for YARN-701 and modified TestMRJobs and SleepJob to 
print out all credentials. As I suspected, ApplicationToken never goes through 
to the task, either via UGI or the conf.

 AppTokens file can/should be removed
 

 Key: MAPREDUCE-5199
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5199
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0, 2.0.5-beta
Reporter: Vinod Kumar Vavilapalli
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: MAPREDUCE-5199.patch


 All the required tokens are propagated to AMs and containers via 
 startContainer(), so there is no need to explicitly create the app-token 
 file that we have today.



[jira] [Updated] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs

2013-05-20 Thread Sergey Tryuber (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Tryuber updated MAPREDUCE-3859:
--

Release Note: Fixed wrong CapacityScheduler resource allocation for high 
memory consumption jobs
  Status: Patch Available  (was: Open)

Fix is for MR1 only. Test + fix is in the patch.

 CapacityScheduler incorrectly utilizes extra-resources of queue for 
 high-memory jobs
 

 Key: MAPREDUCE-3859
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3859
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: capacity-sched
Affects Versions: 1.0.0
 Environment: CDH3u1
Reporter: Sergey Tryuber
Assignee: Sergey Tryuber
 Attachments: test-to-fail.patch.txt


 Imagine we have a queue A with a capacity of 10 slots and 20 as extra-capacity. 
 Jobs that use 3 map slots will never consume more than 9 slots, regardless 
 of how many free slots there are on the cluster.
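
The reported numbers are consistent with a limit check that admits another task only while the slots in use plus the slots that task needs stay within the queue's base capacity. The Java sketch below is a toy model of that arithmetic, not CapacityScheduler code. The class and method names are hypothetical, and both the assumption that each task needs 3 slots and the reading of "20" as the queue's upper limit are made only for illustration.

```java
// Toy model of slot accounting; not the CapacityScheduler implementation.
class SlotLimitSketch {
  /**
   * Greedily admits tasks that each need slotsPerTask slots while the
   * total stays within limit, and returns the number of slots in use.
   */
  static int slotsUsed(int limit, int slotsPerTask) {
    if (slotsPerTask <= 0) {
      throw new IllegalArgumentException("slotsPerTask must be positive");
    }
    int used = 0;
    while (used + slotsPerTask <= limit) {
      used += slotsPerTask;
    }
    return used;
  }

  public static void main(String[] args) {
    // Checked against the base capacity of 10: three tasks fit.
    System.out.println(slotsUsed(10, 3)); // prints 9
    // Checked against an assumed upper limit of 20: six tasks fit.
    System.out.println(slotsUsed(20, 3)); // prints 18
  }
}
```

Under these assumptions, a job stuck at 9 slots suggests the check is being made against the base capacity rather than the larger limit, even when the cluster has free slots.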



[jira] [Updated] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs

2013-05-20 Thread Sergey Tryuber (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Tryuber updated MAPREDUCE-3859:
--

Attachment: MAPREDUCE-3859_MR1_fix_and_test.patch.txt

testcase and fix

 CapacityScheduler incorrectly utilizes extra-resources of queue for 
 high-memory jobs
 

 Key: MAPREDUCE-3859
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3859
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: capacity-sched
Affects Versions: 1.0.0
 Environment: CDH3u1
Reporter: Sergey Tryuber
Assignee: Sergey Tryuber
 Attachments: MAPREDUCE-3859_MR1_fix_and_test.patch.txt, 
 test-to-fail.patch.txt




[jira] [Commented] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs

2013-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661831#comment-13661831
 ] 

Hadoop QA commented on MAPREDUCE-3859:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12583806/MAPREDUCE-3859_MR1_fix_and_test.patch.txt
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3653//console

This message is automatically generated.



[jira] [Commented] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs

2013-05-20 Thread Sergey Tryuber (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661838#comment-13661838
 ] 

Sergey Tryuber commented on MAPREDUCE-3859:
---

Arun, I've attached a patch for branch-1 with a test case and the fix (thanks 
for pointing me to the right branch). 

"Happy to help with YARN/trunk if you want": yes, please. I had trouble 
understanding the test cases for the YARN version of CS. I'm not sure the 
testing architecture is sound: there is one huge capacity scheduler 
configuration with lots of queues, it is created at the beginning of each test 
by a Before method, and every test uses it. I think this is not a good choice, 
because it doesn't allow testing edge cases and it is hard to understand (there 
are no comments at all). So please, could you help me and take care of the fix 
for YARN?

P.S. Hardcored mocks are great, but, personally, I'd prefer old school with 
inversion of control (strategy pattern) and agile architecture.



[jira] [Updated] (MAPREDUCE-5260) Job failed because of JvmManager running into inconsistent state

2013-05-20 Thread zhaoyunjiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaoyunjiong updated MAPREDUCE-5260:


Attachment: MAPREDUCE-5260-branch-1.1.patch

Update patch name for branch 1.1.

 Job failed because of JvmManager running into inconsistent state
 

 Key: MAPREDUCE-5260
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5260
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 1.1.2
Reporter: zhaoyunjiong
 Fix For: 1.1.3

 Attachments: MAPREDUCE-5260-branch-1.1.patch




[jira] [Updated] (MAPREDUCE-5260) Job failed because of JvmManager running into inconsistent state

2013-05-20 Thread zhaoyunjiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaoyunjiong updated MAPREDUCE-5260:


Attachment: (was: MAPREDUCE-5260.patch)



[jira] [Commented] (MAPREDUCE-5257) TestContainerLauncherImpl fails

2013-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661893#comment-13661893
 ] 

Hudson commented on MAPREDUCE-5257:
---

Integrated in Hadoop-Yarn-trunk #215 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/215/])
MAPREDUCE-5257. Fix issues in TestContainerLauncherImpl after YARN-617. 
Contributed by Omkar Vinit Joshi. (Revision 1484349)

 Result = FAILURE
vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1484349
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/launcher/TestContainerLauncherImpl.java


 TestContainerLauncherImpl fails
 ---

 Key: MAPREDUCE-5257
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5257
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 2.0.5-beta
Reporter: Jason Lowe
Assignee: Omkar Vinit Joshi
 Fix For: 2.0.5-beta

 Attachments: MAPREDUCE-5257-20130516.patch


 TestContainerLauncherImpl is hanging and eventually being killed by the 
 surefire timeout which fails a maven test build.



[jira] [Commented] (MAPREDUCE-5257) TestContainerLauncherImpl fails

2013-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661965#comment-13661965
 ] 

Hudson commented on MAPREDUCE-5257:
---

Integrated in Hadoop-Hdfs-trunk #1404 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1404/])
MAPREDUCE-5257. Fix issues in TestContainerLauncherImpl after YARN-617. 
Contributed by Omkar Vinit Joshi. (Revision 1484349)

 Result = FAILURE
vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1484349
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/launcher/TestContainerLauncherImpl.java




[jira] [Commented] (MAPREDUCE-5257) TestContainerLauncherImpl fails

2013-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661983#comment-13661983
 ] 

Hudson commented on MAPREDUCE-5257:
---

Integrated in Hadoop-Mapreduce-trunk #1431 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1431/])
MAPREDUCE-5257. Fix issues in TestContainerLauncherImpl after YARN-617. 
Contributed by Omkar Vinit Joshi. (Revision 1484349)

 Result = FAILURE
vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1484349
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/launcher/TestContainerLauncherImpl.java




[jira] [Commented] (MAPREDUCE-5261) TestRMContainerAllocator is exiting and failing the build

2013-05-20 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662049#comment-13662049
 ] 

Jason Lowe commented on MAPREDUCE-5261:
---

This broke when YARN-617 was integrated.

 TestRMContainerAllocator is exiting and failing the build
 -

 Key: MAPREDUCE-5261
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5261
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.5-beta
Reporter: Jason Lowe

 Recent builds are failing because TestRMContainerAllocator is exiting rather 
 than succeeding or failing.



[jira] [Assigned] (MAPREDUCE-5260) Job failed because of JvmManager running into inconsistent state

2013-05-20 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony reassigned MAPREDUCE-5260:
---

Assignee: zhaoyunjiong

 Job failed because of JvmManager running into inconsistent state
 

 Key: MAPREDUCE-5260
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5260
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 1.1.2
Reporter: zhaoyunjiong
Assignee: zhaoyunjiong
 Fix For: 1.1.3

 Attachments: MAPREDUCE-5260-branch-1.1.patch




[jira] [Commented] (MAPREDUCE-5260) Job failed because of JvmManager running into inconsistent state

2013-05-20 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662080#comment-13662080
 ] 

Benoy Antony commented on MAPREDUCE-5260:
-

reviewed. +1



[jira] [Created] (MAPREDUCE-5261) TestRMContainerAllocator is exiting and failing the build

2013-05-20 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-5261:
-

 Summary: TestRMContainerAllocator is exiting and failing the build
 Key: MAPREDUCE-5261
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5261
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.5-beta
Reporter: Jason Lowe


Recent builds are failing because TestRMContainerAllocator is exiting rather 
than succeeding or failing.



[jira] [Commented] (MAPREDUCE-5261) TestRMContainerAllocator is exiting and failing the build

2013-05-20 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662046#comment-13662046
 ] 

Jason Lowe commented on MAPREDUCE-5261:
---

Output from the test:

{noformat}
2013-05-20 14:32:36,231 DEBUG [AsyncDispatcher event handler] security.BaseContainerTokenSecretManager (BaseContainerTokenSecretManager.java:createPassword(130)) - Creating password for container_1369060353300_0001_01_01 for user container_1369060353300_0001_01_01 (auth:SIMPLE) to be run on NM amNM:1234
2013-05-20 14:32:36,232 DEBUG [AsyncDispatcher event handler] security.ContainerTokenIdentifier (ContainerTokenIdentifier.java:write(98)) - Writing ContainerTokenIdentifier to RPC layer: org.apache.hadoop.yarn.security.ContainerTokenIdentifier@108f2ca6
2013-05-20 14:32:36,241 DEBUG [AsyncDispatcher event handler] security.ContainerTokenIdentifier (ContainerTokenIdentifier.java:write(98)) - Writing ContainerTokenIdentifier to RPC layer: org.apache.hadoop.yarn.security.ContainerTokenIdentifier@108f2ca6
2013-05-20 14:32:36,242 FATAL [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(137)) - Error in dispatcher thread
java.lang.IllegalArgumentException: java.net.UnknownHostException: amNM
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)
    at org.apache.hadoop.yarn.util.BuilderUtils.newContainerToken(BuilderUtils.java:281)
    at org.apache.hadoop.yarn.server.security.BaseContainerTokenSecretManager.createContainerToken(BaseContainerTokenSecretManager.java:202)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainer(FifoScheduler.java:555)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignOffSwitchContainers(FifoScheduler.java:519)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainersOnNode(FifoScheduler.java:447)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainers(FifoScheduler.java:376)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.nodeUpdate(FifoScheduler.java:615)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:644)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:92)
    at org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator$MyResourceManager$1.handle(TestRMContainerAllocator.java:450)
    at org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator$MyResourceManager$1.handle(TestRMContainerAllocator.java:447)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:130)
    at org.apache.hadoop.yarn.event.DrainDispatcher$1.run(DrainDispatcher.java:65)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.UnknownHostException: amNM
    ... 15 more
2013-05-20 14:32:36,244 INFO  [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(140)) - Exiting, bbye..
{noformat}
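
The trace shows an UnknownHostException for the placeholder host name amNM being wrapped in an IllegalArgumentException while a token service is built. The following Java sketch reproduces that shape of failure using only the JDK. TokenServiceSketch and buildService are hypothetical names, and the method is a simplified stand-in, not Hadoop's SecurityUtil.buildTokenService. InetSocketAddress.createUnresolved is used so that no DNS lookup is needed to obtain an unresolved address.

```java
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

// Hypothetical stand-in; not Hadoop's SecurityUtil.
class TokenServiceSketch {
  /**
   * Builds an "ip:port" service string and rejects addresses whose
   * host name has not been resolved to an IP address.
   */
  static String buildService(InetSocketAddress addr) {
    if (addr.isUnresolved()) {
      throw new IllegalArgumentException(
          new UnknownHostException(addr.getHostName()));
    }
    return addr.getAddress().getHostAddress() + ":" + addr.getPort();
  }

  public static void main(String[] args) {
    // createUnresolved never performs a lookup, so this is deterministic.
    InetSocketAddress nm = InetSocketAddress.createUnresolved("amNM", 1234);
    try {
      buildService(nm);
    } catch (IllegalArgumentException e) {
      // The message is the wrapped cause's toString().
      System.out.println(e.getMessage());
    }
  }
}
```

A test that registers a node under a made-up host name would hit this kind of check as soon as a token is built for that node, unless the name resolves in the test environment.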




[jira] [Commented] (MAPREDUCE-5176) Preemptable annotations (to support preemption in MR)

2013-05-20 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662151#comment-13662151
 ] 

Alejandro Abdelnur commented on MAPREDUCE-5176:
---

I like the idea of annotations to drive checkpointing. As preemption is a YARN 
feature, wouldn't it make sense to have @Preemptable as a YARN annotation and have 
utility classes that help an AM implement such logic? By doing this we 
could use this in the AM itself to implement AM failover recovery. Thoughts?

 Preemptable annotations (to support preemption in MR)
 -

 Key: MAPREDUCE-5176
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5176
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: MAPREDUCE-5176.1.patch, MAPREDUCE-5176.patch


 Proposing a patch that introduces a new annotation, @Preemptable, that 
 exposes a property of user-supplied classes (e.g., Reducer, 
 OutputCommitter) to the framework. The intended semantics is that a tagged 
 class is safe to preempt between invocations. 
 (This is similar in spirit to the Output Contracts of [Nephele/PACT | 
 https://stratosphere.eu/sites/default/files/papers/ComparingMapReduceAndPACTs_11.pdf].)
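
 For readers who have not opened the patch, a marker annotation of this kind could look roughly like the sketch below. This is only an illustration: the retention and target settings, the helper isPreemptable, and the two reducer classes are assumptions made here, not the contents of MAPREDUCE-5176.patch.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical declaration; the attached patch may define it differently.
@Retention(RetentionPolicy.RUNTIME) // must be visible to the framework at run time
@Target(ElementType.TYPE)           // applied to classes such as a Reducer
@interface Preemptable {}

// Hypothetical user classes standing in for a Reducer or OutputCommitter.
@Preemptable
class TaggedReducer {}

class UntaggedReducer {}

class PreemptableSketch {
  // True when the class carries the marker, i.e. the author declared it
  // safe to preempt between invocations.
  static boolean isPreemptable(Class<?> userClass) {
    return userClass.isAnnotationPresent(Preemptable.class);
  }

  public static void main(String[] args) {
    System.out.println(isPreemptable(TaggedReducer.class));   // true
    System.out.println(isPreemptable(UntaggedReducer.class)); // false
  }
}
```

 Runtime retention is what lets a framework see the tag through reflection; with the default class-file retention the check above would always return false.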



[jira] [Commented] (MAPREDUCE-5176) Preemptable annotations (to support preemption in MR)

2013-05-20 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662166#comment-13662166
 ] 

Carlo Curino commented on MAPREDUCE-5176:
-

I like the idea, Alejandro. Especially as more and more AMs come into 
existence, providing some of this in the common layer of YARN might be 
interesting, and may help factor out some of the basic mechanisms. We do 
something in this direction for the checkpoint service itself. We posted it in 
MAPREDUCE-5197, but internally we are experimenting with reusing it for other 
AMs, so maybe you are right that this belongs in some common part of the codebase. 
I would like to gather some more opinions on this and try to build consensus 
before starting to shuffle these patches around. 

Thoughts on the idea of having some place in YARN (or whatever other common 
place) to put these annotations and maybe the basic common checkpoint service? 

 Preemptable annotations (to support preemption in MR)
 -

 Key: MAPREDUCE-5176
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5176
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: MAPREDUCE-5176.1.patch, MAPREDUCE-5176.patch


 Proposing a patch that introduces a new annotation, @Preemptable, that 
 exposes a property of user-supplied classes (e.g., Reducer, 
 OutputCommitter) to the framework. The intended semantics is that a tagged 
 class is safe to preempt between invocations. 
 (This is similar in spirit to the Output Contracts of [Nephele/PACT | 
 https://stratosphere.eu/sites/default/files/papers/ComparingMapReduceAndPACTs_11.pdf].)



[jira] [Assigned] (MAPREDUCE-5261) TestRMContainerAllocator is exiting and failing the build

2013-05-20 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli reassigned MAPREDUCE-5261:
--

Assignee: Omkar Vinit Joshi

Sigh, I wish Jenkins ran these tests as part of commit. If not that, it 
should've run all tests and reported all failures in the nightly builds. Omkar, 
can you see if there is a common JIRA for this and fix it?

Also, can you run *all* MR tests? Tx.

 TestRMContainerAllocator is exiting and failing the build
 -

 Key: MAPREDUCE-5261
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5261
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.5-beta
Reporter: Jason Lowe
Assignee: Omkar Vinit Joshi

 Recent builds are failing because TestRMContainerAllocator is exiting rather 
 than succeeding or failing.



[jira] [Commented] (MAPREDUCE-5176) Preemptable annotations (to support preemption in MR)

2013-05-20 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662233#comment-13662233
 ] 

Karthik Kambatla commented on MAPREDUCE-5176:
-

Neat idea to use annotations to capture the operator behavior. +1 to moving it 
to YARN (maybe yarn-common). Also, while at it, I was wondering if it would make 
sense to add an annotation @Stateless?

 Preemptable annotations (to support preemption in MR)
 -

 Key: MAPREDUCE-5176
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5176
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: MAPREDUCE-5176.1.patch, MAPREDUCE-5176.patch


 Proposing a patch that introduces a new annotation, @Preemptable, that 
 exposes a property of user-supplied classes (e.g., Reducer, 
 OutputCommitter) to the framework. The intended semantics is that a tagged 
 class is safe to preempt between invocations. 
 (This is similar in spirit to the Output Contracts of [Nephele/PACT | 
 https://stratosphere.eu/sites/default/files/papers/ComparingMapReduceAndPACTs_11.pdf].)



[jira] [Commented] (MAPREDUCE-5095) TestShuffleExceptionCount#testCheckException fails occasionally with JDK7

2013-05-20 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662260#comment-13662260
 ] 

Arpit Agarwal commented on MAPREDUCE-5095:
--

Hi Hitesh,

Thanks for reviewing! abortCalled cannot be non-static since it is referenced 
from a static nested class.
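
The language rule behind this can be shown with a small sketch. Only the field name abortCalled comes from the thread; the class names and the abort method are made up here and are not the actual test code.

```java
class OuterTestSketch {
  // Must be static: the static nested class below has no enclosing
  // OuterTestSketch instance, so it cannot see an instance field.
  static boolean abortCalled = false;

  static class FakeAborter {
    void abort() {
      // Compiles only because abortCalled is static. With a non-static
      // field, javac rejects this line ("non-static variable abortCalled
      // cannot be referenced from a static context").
      abortCalled = true;
    }
  }

  public static void main(String[] args) {
    new FakeAborter().abort();
    System.out.println(abortCalled); // true
  }
}
```

The alternative would be to make the nested class an inner (non-static) class, which changes how it is instantiated.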

 TestShuffleExceptionCount#testCheckException fails occasionally with JDK7
 -

 Key: MAPREDUCE-5095
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5095
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.2
 Environment: Open JDK7
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Fix For: 1.3.0

 Attachments: MAPREDUCE-5095.patch

   Original Estimate: 1h
  Time Spent: 1h
  Remaining Estimate: 0h

 The test fails due to a test-order dependency that can be violated when running 
 with JDK 7.
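
 As a generic illustration of this kind of failure, the sketch below shows two checks that share a static flag and a setup step that makes them independent of execution order. It is not the attached patch; the method names are invented, and only the field name abortCalled is taken from the discussion.

```java
class OrderDependenceSketch {
  // Shared static flag; the name follows the thread, the rest is illustrative.
  static boolean abortCalled = false;

  // Per-test setup: restore the state every test assumes.
  static void setUp() {
    abortCalled = false;
  }

  // Sets the flag, as a test exercising the abort path would.
  static boolean testAbortPath() {
    setUp();
    abortCalled = true;
    return abortCalled;
  }

  // Passes only if the flag is still clear when the test body runs.
  static boolean testNoAbortPath() {
    setUp();
    return !abortCalled;
  }

  public static void main(String[] args) {
    // Both orders pass because each test resets the flag first.
    System.out.println(testAbortPath() && testNoAbortPath());
    System.out.println(testNoAbortPath() && testAbortPath());
  }
}
```

 Without the setUp() calls, testNoAbortPath would fail whenever it ran after testAbortPath, which is the order dependency described above.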



[jira] [Commented] (MAPREDUCE-5176) Preemptable annotations (to support preemption in MR)

2013-05-20 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662280#comment-13662280
 ] 

Sandy Ryza commented on MAPREDUCE-5176:
---

I'm also a +1 on the idea.  For the stateful operators case, it seems like a 
checkpoint method that gets called before preemption would be useful, right?

 Preemptable annotations (to support preemption in MR)
 -

 Key: MAPREDUCE-5176
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5176
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: MAPREDUCE-5176.1.patch, MAPREDUCE-5176.patch


 Proposing a patch that introduces a new annotation, @Preemptable, that 
 exposes a property of user-supplied classes (e.g., Reducer, 
 OutputCommitter) to the framework. The intended semantics is that a tagged 
 class is safe to preempt between invocations. 
 (This is similar in spirit to the Output Contracts of [Nephele/PACT | 
 https://stratosphere.eu/sites/default/files/papers/ComparingMapReduceAndPACTs_11.pdf].)



[jira] [Commented] (MAPREDUCE-5261) TestRMContainerAllocator is exiting and failing the build

2013-05-20 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662288#comment-13662288
 ] 

Omkar Vinit Joshi commented on MAPREDUCE-5261:
--

Attaching a patch for this. I have created a Hadoop ticket, HADOOP-9580, to fix 
Jenkins and make sure it runs all the tests before commit.

 TestRMContainerAllocator is exiting and failing the build
 -

 Key: MAPREDUCE-5261
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5261
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.5-beta
Reporter: Jason Lowe
Assignee: Omkar Vinit Joshi
 Attachments: MAPREDUCE-5261.patch


 Recent builds are failing because TestRMContainerAllocator is exiting rather 
 than succeeding or failing.



[jira] [Updated] (MAPREDUCE-5261) TestRMContainerAllocator is exiting and failing the build

2013-05-20 Thread Omkar Vinit Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omkar Vinit Joshi updated MAPREDUCE-5261:
-

Attachment: MAPREDUCE-5261.patch

 TestRMContainerAllocator is exiting and failing the build
 -

 Key: MAPREDUCE-5261
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5261
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.5-beta
Reporter: Jason Lowe
Assignee: Omkar Vinit Joshi
 Attachments: MAPREDUCE-5261.patch


 Recent builds are failing because TestRMContainerAllocator is exiting rather 
 than succeeding or failing.



[jira] [Commented] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-05-20 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662313#comment-13662313
 ] 

Hitesh Shah commented on MAPREDUCE-5191:


+1. Committing shortly. 

 TestQueue#testQueue fails with timeout on Windows
 -

 Key: MAPREDUCE-5191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.3.patch, 
 MAPREDUCE-5191.patch


 Test times out on my machine after 5 seconds always on the below stack:
 {code}
 testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec   
 ERROR!
 java.lang.Exception: test timed out after 5000 milliseconds
   at java.lang.Object.wait(Native Method)
   at java.lang.Object.wait(Object.java:485)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
   at 
 sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
   at 
 sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
   at 
 sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
   at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
   at java.security.SecureRandom.next(SecureRandom.java:455)
   at java.util.Random.nextLong(Random.java:284)
   at java.io.File.generateFile(File.java:1682)
   at java.io.File.createTempFile(File.java:1791)
   at java.io.File.createTempFile(File.java:1828)
   at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
   at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
 {code} 
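
 The stack above shows the test blocked inside File.createTempFile, which draws a random file name from SecureRandom and waits on seed generation. One way a test helper can sidestep that path is to write to a fixed file name under a known directory. The sketch below is an assumption about a possible approach, not a description of the attached patches; the class name and the writeFile signature are hypothetical.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

class FixedNameTestFile {
  // Writes content to <dir>/<name>. No random name is generated, so
  // File.createTempFile and SecureRandom are never involved.
  static File writeFile(File dir, String name, String content) throws IOException {
    if (!dir.isDirectory() && !dir.mkdirs()) {
      throw new IOException("Could not create directory " + dir);
    }
    File file = new File(dir, name);
    try (FileWriter writer = new FileWriter(file)) {
      writer.write(content);
    }
    file.deleteOnExit(); // clean up when the JVM exits
    return file;
  }

  public static void main(String[] args) throws IOException {
    File dir = new File(System.getProperty("java.io.tmpdir"), "testqueue-sketch");
    File file = writeFile(dir, "queues.xml", "<queues/>");
    System.out.println(file.exists());
  }
}
```

 The trade-off is that a fixed name can collide if two runs share the directory, so a per-test directory is advisable.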



[jira] [Updated] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-05-20 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5191:
---

   Resolution: Fixed
Fix Version/s: 3.0.0
 Release Note: Thanks Ivan. Committed to trunk.
   Status: Resolved  (was: Patch Available)

 TestQueue#testQueue fails with timeout on Windows
 -

 Key: MAPREDUCE-5191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Fix For: 3.0.0

 Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.3.patch, 
 MAPREDUCE-5191.patch


 Test times out on my machine after 5 seconds always on the below stack:
 {code}
 testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec   
 ERROR!
 java.lang.Exception: test timed out after 5000 milliseconds
   at java.lang.Object.wait(Native Method)
   at java.lang.Object.wait(Object.java:485)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
   at 
 sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
   at 
 sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
   at 
 sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
   at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
   at java.security.SecureRandom.next(SecureRandom.java:455)
   at java.util.Random.nextLong(Random.java:284)
   at java.io.File.generateFile(File.java:1682)
   at java.io.File.createTempFile(File.java:1791)
   at java.io.File.createTempFile(File.java:1828)
   at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
   at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
 {code} 



[jira] [Commented] (MAPREDUCE-5095) TestShuffleExceptionCount#testCheckException fails occasionally with JDK7

2013-05-20 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662316#comment-13662316
 ] 

Hitesh Shah commented on MAPREDUCE-5095:


[~arpitagarwal] Should have reviewed the whole patch in context. Thanks for the 
clarification. +1. Will commit shortly. 

 TestShuffleExceptionCount#testCheckException fails occasionally with JDK7
 -

 Key: MAPREDUCE-5095
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5095
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.2
 Environment: Open JDK7
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Fix For: 1.3.0

 Attachments: MAPREDUCE-5095.patch

   Original Estimate: 1h
  Time Spent: 1h
  Remaining Estimate: 0h

 The test fails due to a test-order dependency that can be violated when running 
 with JDK 7.



[jira] [Resolved] (MAPREDUCE-5095) TestShuffleExceptionCount#testCheckException fails occasionally with JDK7

2013-05-20 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah resolved MAPREDUCE-5095.


  Resolution: Fixed
Release Note: Thanks Arpit. Committed to branch-1. 

 TestShuffleExceptionCount#testCheckException fails occasionally with JDK7
 -

 Key: MAPREDUCE-5095
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5095
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.2
 Environment: Open JDK7
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Fix For: 1.3.0

 Attachments: MAPREDUCE-5095.patch

   Original Estimate: 1h
  Time Spent: 1h
  Remaining Estimate: 0h

 The test fails due to a test-order dependency that can be violated when running 
 with JDK 7.



[jira] [Updated] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-05-20 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5191:
---

Release Note:   (was: Thanks Ivan. Committed to trunk.)

 TestQueue#testQueue fails with timeout on Windows
 -

 Key: MAPREDUCE-5191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Fix For: 3.0.0

 Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.3.patch, 
 MAPREDUCE-5191.patch


 Test times out on my machine after 5 seconds always on the below stack:
 {code}
 testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec   
 ERROR!
 java.lang.Exception: test timed out after 5000 milliseconds
   at java.lang.Object.wait(Native Method)
   at java.lang.Object.wait(Object.java:485)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
   at 
 sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
   at 
 sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
   at 
 sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
   at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
   at java.security.SecureRandom.next(SecureRandom.java:455)
   at java.util.Random.nextLong(Random.java:284)
   at java.io.File.generateFile(File.java:1682)
   at java.io.File.createTempFile(File.java:1791)
   at java.io.File.createTempFile(File.java:1828)
   at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
   at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
 {code} 



[jira] [Updated] (MAPREDUCE-5095) TestShuffleExceptionCount#testCheckException fails occasionally with JDK7

2013-05-20 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated MAPREDUCE-5095:
---

Release Note:   (was: Thanks Arpit. Committed to branch-1. )

 TestShuffleExceptionCount#testCheckException fails occasionally with JDK7
 -

 Key: MAPREDUCE-5095
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5095
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.2
 Environment: Open JDK7
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Fix For: 1.3.0

 Attachments: MAPREDUCE-5095.patch

   Original Estimate: 1h
  Time Spent: 1h
  Remaining Estimate: 0h

 The test fails due to a test-order dependency that can be violated when running 
 with JDK 7.



[jira] [Commented] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-05-20 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662320#comment-13662320
 ] 

Hitesh Shah commented on MAPREDUCE-5191:


Thanks Ivan. Committed to trunk.

 TestQueue#testQueue fails with timeout on Windows
 -

 Key: MAPREDUCE-5191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Fix For: 3.0.0

 Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.3.patch, 
 MAPREDUCE-5191.patch


 Test times out on my machine after 5 seconds always on the below stack:
 {code}
 testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec   
 ERROR!
 java.lang.Exception: test timed out after 5000 milliseconds
   at java.lang.Object.wait(Native Method)
   at java.lang.Object.wait(Object.java:485)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
   at 
 sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
   at 
 sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
   at 
 sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
   at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
   at java.security.SecureRandom.next(SecureRandom.java:455)
   at java.util.Random.nextLong(Random.java:284)
   at java.io.File.generateFile(File.java:1682)
   at java.io.File.createTempFile(File.java:1791)
   at java.io.File.createTempFile(File.java:1828)
   at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
   at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
 {code} 



[jira] [Commented] (MAPREDUCE-5095) TestShuffleExceptionCount#testCheckException fails occasionally with JDK7

2013-05-20 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662321#comment-13662321
 ] 

Hitesh Shah commented on MAPREDUCE-5095:


Thanks Arpit. Committed to branch-1. 

 TestShuffleExceptionCount#testCheckException fails occasionally with JDK7
 -

 Key: MAPREDUCE-5095
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5095
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.2
 Environment: Open JDK7
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Fix For: 1.3.0

 Attachments: MAPREDUCE-5095.patch

   Original Estimate: 1h
  Time Spent: 1h
  Remaining Estimate: 0h

 The test fails due to a test-order dependency that can be violated when running 
 with JDK 7.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662323#comment-13662323
 ] 

Hudson commented on MAPREDUCE-5191:
---

Integrated in Hadoop-trunk-Commit #3769 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3769/])
MAPREDUCE-5191. TestQueue#testQueue fails with timeout on Windows. 
(Contributed by Ivan Mitic) (Revision 1484575)

 Result = SUCCESS
hitesh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1484575
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestQueue.java


 TestQueue#testQueue fails with timeout on Windows
 -

 Key: MAPREDUCE-5191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Fix For: 3.0.0

 Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.3.patch, 
 MAPREDUCE-5191.patch


 Test times out on my machine after 5 seconds always on the below stack:
 {code}
 testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec   
 ERROR!
 java.lang.Exception: test timed out after 5000 milliseconds
   at java.lang.Object.wait(Native Method)
   at java.lang.Object.wait(Object.java:485)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
   at 
 sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
   at 
 sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
   at 
 sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
   at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
   at java.security.SecureRandom.next(SecureRandom.java:455)
   at java.util.Random.nextLong(Random.java:284)
   at java.io.File.generateFile(File.java:1682)
   at java.io.File.createTempFile(File.java:1791)
   at java.io.File.createTempFile(File.java:1828)
   at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
   at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
 {code} 



[jira] [Commented] (MAPREDUCE-5176) Preemptable annotations (to support preemption in MR)

2013-05-20 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662330#comment-13662330
 ] 

Carlo Curino commented on MAPREDUCE-5176:
-

Karthik, I think there are several other annotations we might want to think 
about. @Stateless is one, another one is @PreserveKeyOrder which can express 
for example that maps are not messing with sort order and would allow a smart 
runtime to pipeline maps and reduces (and skip shuffling) when the input is 
sorted. I think this could be a powerful tool to expose opportunities for 
runtime optimization which are not possible in the general case (unless you 
know something about the code semantics). 

If no one disagrees, I like the idea of having this in a common place (maybe 
yarn-common, as Karthik suggested?). 

BTW I think we are in a great spot to carry this conversation along, since we 
have one very specific example of these annotations, @Preemptable, for which we 
have the entire end-to-end usage scenario (all the preemption-in-MapReduce 
work tracked in MAPREDUCE-5189, MAPREDUCE-5192, MAPREDUCE-5194, 
MAPREDUCE-5196, MAPREDUCE-5197 and a few upcoming ones), and plenty more ideas 
coming up from people. 

Ideally I would like to move forward with the @Preemptable one and see it 
through (so we can evaluate it and learn from it), and in parallel we can 
initiate a broader (and rightfully longer) conversation around annotations for 
runtime-optimization.

Sandy, what we envisioned for MapReduce is that an advanced user who has a 
stateful UDF can mark it as @Preemptable and override the default save-to-checkpoint 
logic to include the portion of state he/she cares about. 
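
To make the "override the default" idea concrete, here is a rough sketch. Everything in it is hypothetical: the base class PreemptableUdf, the saveState and restoreState hooks, and the example UDF are invented for illustration and do not reflect the API in the MAPREDUCE-5176 or MAPREDUCE-5197 patches.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical base class: by default the checkpoint carries no user state.
abstract class PreemptableUdf {
  Map<String, Long> saveState() {
    return Collections.emptyMap();
  }

  void restoreState(Map<String, Long> saved) {
    // Nothing to restore by default.
  }
}

// A stateful UDF that keeps a running total across keys, so it overrides
// both hooks to carry that total through a preemption.
class RunningTotalUdf extends PreemptableUdf {
  private long total = 0;

  void accept(long value) {
    total += value;
  }

  long total() {
    return total;
  }

  @Override
  Map<String, Long> saveState() {
    Map<String, Long> state = new HashMap<String, Long>();
    state.put("total", total);
    return state;
  }

  @Override
  void restoreState(Map<String, Long> saved) {
    Long savedTotal = saved.get("total");
    total = (savedTotal == null) ? 0 : savedTotal;
  }

  public static void main(String[] args) {
    RunningTotalUdf first = new RunningTotalUdf();
    first.accept(3);
    first.accept(4);
    Map<String, Long> checkpoint = first.saveState(); // taken before preemption

    RunningTotalUdf resumed = new RunningTotalUdf();  // fresh task after preemption
    resumed.restoreState(checkpoint);
    resumed.accept(5);
    System.out.println(resumed.total()); // 12
  }
}
```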


 Preemptable annotations (to support preemption in MR)
 -

 Key: MAPREDUCE-5176
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5176
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: MAPREDUCE-5176.1.patch, MAPREDUCE-5176.patch


 Proposing a patch that introduces a new annotation, @Preemptable, that 
 exposes a property of user-supplied classes (e.g., Reducer, 
 OutputCommitter) to the framework. The intended semantics is that a tagged 
 class is safe to preempt between invocations. 
 (This is similar in spirit to the Output Contracts of [Nephele/PACT | 
 https://stratosphere.eu/sites/default/files/papers/ComparingMapReduceAndPACTs_11.pdf].)



[jira] [Commented] (MAPREDUCE-5176) Preemptable annotations (to support preemption in MR)

2013-05-20 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662339#comment-13662339
 ] 

Karthik Kambatla commented on MAPREDUCE-5176:
-

bq. what we envisioned for MapReduce is that an advanced user who has a 
stateful UDF can mark it as @Preemptable and override the default save-to-checkpoint 
logic to include the portion of state he/she cares about.

While that is perfectly fine for a first draft solution, a user might prefer to 
get feedback when they annotate @Preemptable without fully understanding the 
consequences. Assuming @Stateless is also added, the preemption code can warn 
the user that they have annotated @Preemptable without @Stateless and that they 
are expected to implement/override the checkpoint logic. Thoughts?
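
A minimal sketch of the suggested warning, assuming both annotations are plain runtime-visible markers. The annotation declarations, the check method, and the message text are all hypothetical and only illustrate the shape of such a check.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical marker annotations; neither declaration is taken from a patch.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Preemptable {}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Stateless {}

@Preemptable
@Stateless
class StatelessMapper {}

@Preemptable
class StatefulReducer {}

class PreemptionLint {
  // Returns a warning for classes tagged @Preemptable but not @Stateless,
  // or null when there is nothing to warn about.
  static String check(Class<?> userClass) {
    boolean preemptable = userClass.isAnnotationPresent(Preemptable.class);
    boolean stateless = userClass.isAnnotationPresent(Stateless.class);
    if (preemptable && !stateless) {
      return "WARN: " + userClass.getSimpleName()
          + " is @Preemptable but not @Stateless;"
          + " implement or override the checkpoint logic";
    }
    return null;
  }

  public static void main(String[] args) {
    System.out.println(check(StatelessMapper.class)); // null
    System.out.println(check(StatefulReducer.class)); // warning text
  }
}
```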


 Preemptable annotations (to support preemption in MR)
 -

 Key: MAPREDUCE-5176
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5176
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: MAPREDUCE-5176.1.patch, MAPREDUCE-5176.patch


 Proposing a patch that introduces a new annotation, @Preemptable, that 
 exposes a property of user-supplied classes (e.g., Reducer, 
 OutputCommitter) to the framework. The intended semantics is that a tagged 
 class is safe to preempt between invocations. 
 (This is similar in spirit to the Output Contracts of [Nephele/PACT | 
 https://stratosphere.eu/sites/default/files/papers/ComparingMapReduceAndPACTs_11.pdf].)



[jira] [Commented] (MAPREDUCE-5176) Preemptable annotations (to support preemption in MR)

2013-05-20 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662348#comment-13662348
 ] 

Carlo Curino commented on MAPREDUCE-5176:
-

Interesting point, although @Stateless is a stricter condition than we need. It 
is OK to maintain state as long as it is not semantically required to persist 
across key boundaries. Two common cases are: 1) you have state that you reset 
or ignore at every new key group, e.g., an aggregate over a group-by key, and 
2) you maintain state as an optimization (memoization) but it is not required 
for correctness. So while @Stateless would guarantee that it is safe to preempt 
using the default checkpointing, it is tighter than we need. 

In general, I would expect a user who tags their code to understand the 
@Preemptable semantics: if your code does not depend on state being preserved 
across key boundaries, you are good to go; otherwise you should carefully 
override these methods.
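
The two cases above, plus a counter-example, can be sketched as follows. This 
is a minimal illustration with plain Java classes; the class names and the 
reduce(key, values) signature are made up for the example and do not come from 
the patch or from the Hadoop Reducer API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class KeyBoundaryStateSketch {

  // Case 1: state is reset at every new key group.
  static class SumPerKey {
    private long sum;

    long reduce(String key, List<Long> values) {
      sum = 0; // nothing carries over from the previous key
      for (long v : values) {
        sum += v;
      }
      return sum;
    }
  }

  // Case 2: state is only a memoization cache; losing it costs time, not correctness.
  static class MemoizedSquare {
    private final Map<Long, Long> cache = new HashMap<>();

    long reduce(String key, List<Long> values) {
      long out = 0;
      for (long v : values) {
        out += cache.computeIfAbsent(v, x -> x * x);
      }
      return out;
    }
  }

  // Counter-example: a running total across keys must survive a key boundary.
  static class RunningTotal {
    private long total;

    long reduce(String key, List<Long> values) {
      for (long v : values) {
        total += v;
      }
      return total;
    }
  }

  public static void main(String[] args) {
    List<Long> a = Arrays.asList(1L, 2L);
    List<Long> b = Arrays.asList(3L);

    // Simulate preemption between keys: key "b" is handled by a fresh instance.
    SumPerKey uninterrupted = new SumPerKey();
    uninterrupted.reduce("a", a);
    System.out.println("SumPerKey unaffected: "
        + (uninterrupted.reduce("b", b) == new SumPerKey().reduce("b", b)));

    RunningTotal running = new RunningTotal();
    running.reduce("a", a);
    System.out.println("RunningTotal unaffected: "
        + (running.reduce("b", b) == new RunningTotal().reduce("b", b)));
  }
}
```

Under the semantics described here, the first two classes would be candidates 
for @Preemptable with the default checkpointing, while something like 
RunningTotal would need to override the checkpoint methods.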



 Preemptable annotations (to support preemption in MR)
 -

 Key: MAPREDUCE-5176
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5176
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: MAPREDUCE-5176.1.patch, MAPREDUCE-5176.patch


 Proposing a patch that introduces a new annotation @Preemptable that 
 represents to the framework a property of user-supplied classes (e.g., Reducer, 
 OutputCommitter). The intended semantics is that a tagged class is safe to be 
 preempted between invocations. 
 (this is in spirit similar to the Output Contracts of [Nephele/PACT | 
 https://stratosphere.eu/sites/default/files/papers/ComparingMapReduceAndPACTs_11.pdf])



[jira] [Updated] (MAPREDUCE-5240) inside of FileOutputCommitter the initialized Credentials cache appears to be empty

2013-05-20 Thread Konstantin Boudnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Boudnik updated MAPREDUCE-5240:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

I have committed the modified patch to 2.0.4.1. Thanks Roman.

 inside of FileOutputCommitter the initialized Credentials cache appears to be 
 empty
 ---

 Key: MAPREDUCE-5240
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5240
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.4-alpha
Reporter: Roman Shaposhnik
Assignee: Vinod Kumar Vavilapalli
Priority: Blocker
  Labels: 2.0.4.1
 Fix For: 2.0.5-beta, 2.0.4.1-alpha

 Attachments: LostCreds.java, MAPREDUCE-5240-20130512.txt, 
 MAPREDUCE-5240-20130513.txt, MAPREDUCE-5240.2.0.4.rvs.patch.txt


 I am attaching a modified wordcount job that clearly demonstrates the problem 
 we've encountered in running Sqoop2 on YARN (BIGTOP-949).
 Here's what running it produces:
 {noformat}
 $ hadoop fs -mkdir in
 $ hadoop fs -put /etc/passwd in
 $ hadoop jar ./bug.jar org.myorg.LostCreds
 13/05/12 03:13:46 WARN mapred.JobConf: The variable mapred.child.ulimit is no 
 longer used.
 numberOfSecretKeys: 1
 numberOfTokens: 0
 ..
 ..
 ..
 13/05/12 03:05:35 INFO mapreduce.Job: Job job_1368318686284_0013 failed with 
 state FAILED due to: Job commit failed: java.io.IOException:
 numberOfSecretKeys: 0
 numberOfTokens: 0
   at 
 org.myorg.LostCreds$DestroyerFileOutputCommitter.commitJob(LostCreds.java:43)
   at 
 org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:249)
   at 
 org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:212)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:619)
 {noformat}
 As you can see, even though we've clearly initialized the creds via:
 {noformat}
 job.getCredentials().addSecretKey(new Text(mykey), mysecret.getBytes());
 {noformat}
 It doesn't seem to appear later in the job.
 This is a pretty critical issue for Sqoop 2 since it appears to be DOA for 
 YARN in Hadoop 2.0.4-alpha



[jira] [Commented] (MAPREDUCE-5199) AppTokens file can/should be removed

2013-05-20 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662357#comment-13662357
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-5199:


I just finished another test with debug-delete-delays and wait times: none of 
the tasks' files (the appTokens, containerTokens, and appId.tokens files) 
contain any ApplicationTokens for tasks. They are only present in the AM's files.

So, back to square one, no idea why oozie workflows would fail.

That said, this patch can go in anyways and if it somehow fixes your issue, 
great.

There are a couple of other solutions to avoid tasks using the wrong token for 
the AM-RM connection, like fixing the TokenSelector, but we can pursue that 
separately to unblock you.

Will review the patch in a little while.

 AppTokens file can/should be removed
 

 Key: MAPREDUCE-5199
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5199
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0, 2.0.5-beta
Reporter: Vinod Kumar Vavilapalli
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: MAPREDUCE-5199.patch


 All the required tokens are propagated to AMs and containers via 
 startContainer(), so there is no need to explicitly create the app-token file 
 that we have today.



[jira] [Commented] (MAPREDUCE-5199) AppTokens file can/should be removed

2013-05-20 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662360#comment-13662360
 ] 

Siddharth Seth commented on MAPREDUCE-5199:
---

bq. The issue stems from conf.getCredentials().addAll(credentials).
These tokens aren't used to create the LaunchContext for the child. I still 
don't see how the App Token is leaking into the appTokens file.

The changes in the patch are required irrespective of this issue. It'd be very 
useful to understand what is causing the appToken clobber though.

A couple of comments on the patch itself.
- Should downloadTokensAndSetupUGI be called as part of initAndStartAppMaster 
itself, so that the jobConf credentials can be populated before the init?
- Rename downloadTokensAndSetupUGI to something like setupJobTokensAndUGI?


 AppTokens file can/should be removed
 

 Key: MAPREDUCE-5199
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5199
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0, 2.0.5-beta
Reporter: Vinod Kumar Vavilapalli
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: MAPREDUCE-5199.patch


 All the required tokens are propagated to AMs and containers via 
 startContainer(), so there is no need to explicitly create the app-token file 
 that we have today.



[jira] [Commented] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs

2013-05-20 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662363#comment-13662363
 ] 

Arun C Murthy commented on MAPREDUCE-3859:
--

[~sergeant] I've just committed this to branch-1 and branch-1.2, so we'll pick 
it up for hadoop-1.2.1.

I've also helped add a test case and added this to trunk/branch-2. Thanks!

 CapacityScheduler incorrectly utilizes extra-resources of queue for 
 high-memory jobs
 

 Key: MAPREDUCE-3859
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3859
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: capacity-sched
Affects Versions: 1.0.0
 Environment: CDH3u1
Reporter: Sergey Tryuber
Assignee: Sergey Tryuber
 Attachments: MAPREDUCE-3859_MR1_fix_and_test.patch.txt, 
 test-to-fail.patch.txt


 Imagine we have a queue A with a capacity of 10 slots and 20 as extra capacity: 
 jobs which use 3 map slots will never consume more than 9 slots, regardless 
 of how many free slots there are on the cluster.
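
The figure of 9 is consistent with a ceiling of the following kind. This is a 
sketch of the arithmetic only, under two assumptions that the text here does 
not confirm: that each task of the job occupies 3 slots, and that a task is 
granted only while the queue's occupied slots plus the task's slots stay within 
some limit. The class and method names are hypothetical, not CapacityScheduler code.

```java
public class SlotCeilingSketch {

  // Largest number of slots a job can hold when a task is granted only
  // while used + slotsPerTask <= limit (assumed admission rule).
  static int maxSlotsUsed(int limit, int slotsPerTask) {
    if (slotsPerTask <= 0) {
      throw new IllegalArgumentException("slotsPerTask must be positive");
    }
    int used = 0;
    while (used + slotsPerTask <= limit) {
      used += slotsPerTask;
    }
    return used;
  }

  public static void main(String[] args) {
    int capacity = 10;
    int extra = 20;
    int slotsPerTask = 3;

    // Limit taken as the plain capacity: the job stops at 9 slots.
    System.out.println(maxSlotsUsed(capacity, slotsPerTask));

    // Limit taken as capacity plus extra (assuming the extra is additive).
    System.out.println(maxSlotsUsed(capacity + extra, slotsPerTask));
  }
}
```

With a limit of 10 and 3 slots per task, the fourth task would need 12 slots, 
so the job stays at 9; the reported bug is that the extra capacity never raises 
that limit for such jobs.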



[jira] [Updated] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs

2013-05-20 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-3859:
-

Target Version/s:   (was: 1.3.0)
   Fix Version/s: 1.2.1
  2.0.5-beta

 CapacityScheduler incorrectly utilizes extra-resources of queue for 
 high-memory jobs
 

 Key: MAPREDUCE-3859
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3859
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: capacity-sched
Affects Versions: 1.0.0
 Environment: CDH3u1
Reporter: Sergey Tryuber
Assignee: Sergey Tryuber
 Fix For: 2.0.5-beta, 1.2.1

 Attachments: MAPREDUCE-3859_MR1_fix_and_test.patch.txt, 
 test-to-fail.patch.txt


 Imagine we have a queue A with a capacity of 10 slots and 20 as extra capacity: 
 jobs which use 3 map slots will never consume more than 9 slots, regardless 
 of how many free slots there are on the cluster.



[jira] [Updated] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs

2013-05-20 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-3859:
-

Environment: (was: CDH3u1)

 CapacityScheduler incorrectly utilizes extra-resources of queue for 
 high-memory jobs
 

 Key: MAPREDUCE-3859
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3859
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: capacity-sched
Affects Versions: 1.0.0
Reporter: Sergey Tryuber
Assignee: Sergey Tryuber
 Fix For: 2.0.5-beta, 1.2.1

 Attachments: MAPREDUCE-3859_MR1_fix_and_test.patch.txt, 
 test-to-fail.patch.txt


 Imagine we have a queue A with a capacity of 10 slots and 20 as extra capacity: 
 jobs which use 3 map slots will never consume more than 9 slots, regardless 
 of how many free slots there are on the cluster.



[jira] [Updated] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs

2013-05-20 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-3859:
-

Status: Open  (was: Patch Available)

 CapacityScheduler incorrectly utilizes extra-resources of queue for 
 high-memory jobs
 

 Key: MAPREDUCE-3859
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3859
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: capacity-sched
Affects Versions: 1.0.0
Reporter: Sergey Tryuber
Assignee: Sergey Tryuber
 Fix For: 2.0.5-beta, 1.2.1

 Attachments: MAPREDUCE-3859_MR1_fix_and_test.patch.txt, 
 test-to-fail.patch.txt


 Imagine we have a queue A with a capacity of 10 slots and 20 as extra capacity: 
 jobs which use 3 map slots will never consume more than 9 slots, regardless 
 of how many free slots there are on the cluster.



[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-20 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662398#comment-13662398
 ] 

Xi Fang commented on MAPREDUCE-5224:


Hi Ivan, I was addressing your fourth comment. I have one question.
There are two methods:
-
  /**
   * Grab the local fs name
   */
  public synchronized String getFilesystemName() throws IOException {
    if (fs == null) {
      throw new IllegalStateException("FileSystem object not available yet");
    }
    return fs.getUri().toString();
  }

-

  /**
   * Get JobTracker's FileSystem. This is the filesystem for mapred.system.dir.
   */
  FileSystem getFileSystem() {
    return fs;
  }

I am a little bit confused. For getFileSystem() I think it is clear: we still 
return the systemDir's file system, so we should change this fs to systemDirFs, 
which I omitted in my previous patch.

For getFilesystemName(), what does fs stand for in this context, default fs or 
systemDir's file system. I guess it denotes the latter one. Right?

Thanks


 JobTracker should allow the system directory to be in non-default FS
 

 Key: MAPREDUCE-5224
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5224
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Reporter: Xi Fang
Assignee: Xi Fang
Priority: Minor
 Fix For: 1-win

 Attachments: MAPREDUCE-5224.2.patch, MAPREDUCE-5224.patch


 JobTracker today expects the system directory to be in the default file 
 system:
 if (fs == null) {
   fs = mrOwner.doAs(new PrivilegedExceptionAction<FileSystem>() {
     public FileSystem run() throws IOException {
       return FileSystem.get(conf);
   }});
 }
 ...
   public String getSystemDir() {
     Path sysDir = new Path(conf.get("mapred.system.dir", 
         "/tmp/hadoop/mapred/system"));
     return fs.makeQualified(sysDir).toString();
   }
 In a cloud environment such as Azure, the default file system is set to ASV 
 (Windows Azure Blob Storage), but we would still like the system directory to 
 be in DFS. We should change JobTracker to allow that.
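
The intended behaviour can be illustrated without Hadoop, using only 
java.net.URI. This is a sketch under assumptions, not the attached patch: the 
class, the method names, and the example URIs (hdfs://namenode:8020, 
asv://container@account) are hypothetical. In Hadoop itself, 
Path.getFileSystem(Configuration) resolves a path to the file system named by 
its scheme and authority; whether the patch uses it is not stated here.

```java
import java.net.URI;

public class SystemDirFsSketch {

  // Pick the file system that owns the system directory: the directory's own
  // scheme and authority if it has them, otherwise the default file system.
  static URI fileSystemFor(String systemDir, URI defaultFs) {
    URI dir = URI.create(systemDir);
    if (dir.getScheme() != null && dir.getAuthority() != null) {
      return URI.create(dir.getScheme() + "://" + dir.getAuthority());
    }
    return defaultFs;
  }

  // Fully qualify the system directory against the file system chosen above.
  // Assumes the file system URI has no trailing slash.
  static String qualify(String systemDir, URI defaultFs) {
    URI fs = fileSystemFor(systemDir, defaultFs);
    return fs.toString() + URI.create(systemDir).getPath();
  }

  public static void main(String[] args) {
    URI defaultFs = URI.create("asv://container@account"); // hypothetical default FS

    // An explicit hdfs:// system dir stays on DFS even though the default is ASV.
    System.out.println(qualify("hdfs://namenode:8020/mapred/system", defaultFs));

    // A scheme-less system dir falls back to the default file system.
    System.out.println(qualify("/tmp/hadoop/mapred/system", defaultFs));
  }
}
```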



[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-20 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662400#comment-13662400
 ] 

Xi Fang commented on MAPREDUCE-5224:


Sorry for the format! The system changed my text to something else because of 
the special symbols. 

 JobTracker should allow the system directory to be in non-default FS
 

 Key: MAPREDUCE-5224
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5224
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Reporter: Xi Fang
Assignee: Xi Fang
Priority: Minor
 Fix For: 1-win

 Attachments: MAPREDUCE-5224.2.patch, MAPREDUCE-5224.patch


 JobTracker today expects the system directory to be in the default file 
 system:
 if (fs == null) {
   fs = mrOwner.doAs(new PrivilegedExceptionAction<FileSystem>() {
     public FileSystem run() throws IOException {
       return FileSystem.get(conf);
   }});
 }
 ...
   public String getSystemDir() {
     Path sysDir = new Path(conf.get("mapred.system.dir", 
         "/tmp/hadoop/mapred/system"));
     return fs.makeQualified(sysDir).toString();
   }
 In a cloud environment such as Azure, the default file system is set to ASV 
 (Windows Azure Blob Storage), but we would still like the system directory to 
 be in DFS. We should change JobTracker to allow that.



[jira] [Commented] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-05-20 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662407#comment-13662407
 ] 

Ivan Mitic commented on MAPREDUCE-5191:
---

Thanks Hitesh!

 TestQueue#testQueue fails with timeout on Windows
 -

 Key: MAPREDUCE-5191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Fix For: 3.0.0

 Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.3.patch, 
 MAPREDUCE-5191.patch


 The test always times out on my machine after 5 seconds, with the stack below:
 {code}
 testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec   
 ERROR!
 java.lang.Exception: test timed out after 5000 milliseconds
   at java.lang.Object.wait(Native Method)
   at java.lang.Object.wait(Object.java:485)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
   at 
 sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
   at 
 sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
   at 
 sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
   at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
   at java.security.SecureRandom.next(SecureRandom.java:455)
   at java.util.Random.nextLong(Random.java:284)
   at java.io.File.generateFile(File.java:1682)
   at java.io.File.createTempFile(File.java:1791)
   at java.io.File.createTempFile(File.java:1828)
   at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
   at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
 {code} 



[jira] [Commented] (MAPREDUCE-5259) TestTaskLog fails on Windows because of path separators missmatch

2013-05-20 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662487#comment-13662487
 ] 

Chris Nauroth commented on MAPREDUCE-5259:
--

+1 for the patch.  I verified that the test passes on Mac and Windows.  Thank 
you for your contribution, Ivan!

 TestTaskLog fails on Windows because of path separators missmatch
 -

 Key: MAPREDUCE-5259
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5259
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5259.patch


 Test failure:
 {noformat}
 Running org.apache.hadoop.mapred.TestTaskLog
 Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.516 sec  
 FAILURE!
 testTaskLog(org.apache.hadoop.mapred.TestTaskLog)  Time elapsed: 409 sec   
 FAILURE!
 junit.framework.AssertionFailedError: null
   at junit.framework.Assert.fail(Assert.java:47)
   at junit.framework.Assert.assertTrue(Assert.java:20)
   at junit.framework.Assert.assertTrue(Assert.java:27)
   at org.apache.hadoop.mapred.TestTaskLog.testTaskLog(TestTaskLog.java:54)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
   at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
   at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
   at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
   at 
 org.junit.internal.runners.statements.FailOnTimeout$1.run(FailOnTimeout.java:28)
 {noformat}



[jira] [Work started] (MAPREDUCE-5260) Job failed because of JvmManager running into inconsistent state

2013-05-20 Thread zhaoyunjiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on MAPREDUCE-5260 started by zhaoyunjiong.

 Job failed because of JvmManager running into inconsistent state
 

 Key: MAPREDUCE-5260
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5260
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 1.1.2
Reporter: zhaoyunjiong
Assignee: zhaoyunjiong
 Fix For: 1.1.3

 Attachments: MAPREDUCE-5260-branch-1.1.patch


 In our cluster, jobs failed because task initialization randomly failed: 
 JvmManager ran into an inconsistent state and TaskTracker failed 
 to exit:
 java.lang.Throwable: Child Error
   at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.mapred.JvmManager$JvmManagerForType.getDetails(JvmManager.java:402)
   at 
 org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:387)
   at 
 org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:192)
   at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:125)
   at 
 org.apache.hadoop.mapred.TaskRunner.launchJvmAndWait(TaskRunner.java:292)
   at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:251)
 ---
 java.lang.Throwable: Child Error
   at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.mapred.JvmManager$JvmManagerForType.getDetails(JvmManager.java:402)
   at 
 org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:387)
   at 
 org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:192)
   at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:125)
   at 
 org.apache.hadoop.mapred.TaskRunner.launchJvmAndWait(TaskRunner.java:292)
   at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:251)



[jira] [Commented] (MAPREDUCE-5038) old API CombineFileInputFormat missing fixes that are in new API

2013-05-20 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662578#comment-13662578
 ] 

Arun C Murthy commented on MAPREDUCE-5038:
--

[~sandyr] Do you know why we are getting the wrong URLs?

 old API CombineFileInputFormat missing fixes that are in new API 
 -

 Key: MAPREDUCE-5038
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5038
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.1
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Fix For: 1.3.0

 Attachments: MAPREDUCE-5038-1.patch, MAPREDUCE-5038.patch, 
 MAPREDUCE-5038-revised-1.patch, MAPREDUCE-5038-revised-1.patch, 
 MAPREDUCE-5038-revised.patch


 The following changes patched the CombineFileInputFormat in mapreduce, but 
 neglected the one in mapred
 MAPREDUCE-1597 enabled the CombineFileInputFormat to work on splittable files
 MAPREDUCE-2021 solved returning duplicate hostnames in split locations
 MAPREDUCE-1806 CombineFileInputFormat does not work with paths not on default 
 FS
 In trunk this is not an issue as the one in mapred extends the one in 
 mapreduce.



[jira] [Commented] (MAPREDUCE-5038) old API CombineFileInputFormat missing fixes that are in new API

2013-05-20 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662608#comment-13662608
 ] 

Sandy Ryza commented on MAPREDUCE-5038:
---

From taking a deep look at the CombineFileInputFormat code, as well as copying 
this code into Hive and running it to see what happens, there doesn't appear 
to be anything on the MapReduce side that could be modifying the authority in 
the URL that's passed in.  So I think the wrong URLs must be coming from Hive.

 old API CombineFileInputFormat missing fixes that are in new API 
 -

 Key: MAPREDUCE-5038
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5038
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.1
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Fix For: 1.3.0

 Attachments: MAPREDUCE-5038-1.patch, MAPREDUCE-5038.patch, 
 MAPREDUCE-5038-revised-1.patch, MAPREDUCE-5038-revised-1.patch, 
 MAPREDUCE-5038-revised.patch


 The following changes patched the CombineFileInputFormat in mapreduce, but 
 neglected the one in mapred
 MAPREDUCE-1597 enabled the CombineFileInputFormat to work on splittable files
 MAPREDUCE-2021 solved returning duplicate hostnames in split locations
 MAPREDUCE-1806 CombineFileInputFormat does not work with paths not on default 
 FS
 In trunk this is not an issue as the one in mapred extends the one in 
 mapreduce.



[jira] [Commented] (MAPREDUCE-5038) old API CombineFileInputFormat missing fixes that are in new API

2013-05-20 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662609#comment-13662609
 ] 

Sandy Ryza commented on MAPREDUCE-5038:
---

*on the MapReduce side

 old API CombineFileInputFormat missing fixes that are in new API 
 -

 Key: MAPREDUCE-5038
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5038
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.1.1
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Fix For: 1.3.0

 Attachments: MAPREDUCE-5038-1.patch, MAPREDUCE-5038.patch, 
 MAPREDUCE-5038-revised-1.patch, MAPREDUCE-5038-revised-1.patch, 
 MAPREDUCE-5038-revised.patch


 The following changes patched the CombineFileInputFormat in mapreduce, but 
 neglected the one in mapred
 MAPREDUCE-1597 enabled the CombineFileInputFormat to work on splittable files
 MAPREDUCE-2021 solved returning duplicate hostnames in split locations
 MAPREDUCE-1806 CombineFileInputFormat does not work with paths not on default 
 FS
 In trunk this is not an issue as the one in mapred extends the one in 
 mapreduce.



[jira] [Commented] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs

2013-05-20 Thread Sergey Tryuber (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662647#comment-13662647
 ] 

Sergey Tryuber commented on MAPREDUCE-3859:
---

Thanks, Arun

 CapacityScheduler incorrectly utilizes extra-resources of queue for 
 high-memory jobs
 

 Key: MAPREDUCE-3859
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3859
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: capacity-sched
Affects Versions: 1.0.0
Reporter: Sergey Tryuber
Assignee: Sergey Tryuber
 Fix For: 2.0.5-beta, 1.2.1

 Attachments: MAPREDUCE-3859_MR1_fix_and_test.patch.txt, 
 test-to-fail.patch.txt


 Imagine we have a queue A with a capacity of 10 slots and 20 as extra capacity: 
 jobs which use 3 map slots will never consume more than 9 slots, regardless 
 of how many free slots there are on the cluster.



[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-20 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662675#comment-13662675
 ] 

Ivan Mitic commented on MAPREDUCE-5224:
---

Thanks Xi for addressing the comments!

bq. For getFilesystemName(), what does fs stand for in this context, default fs 
or systemDir's file system. I guess it denotes the latter one. Right?
Right, I also read it as the systemDir's file system.

 JobTracker should allow the system directory to be in non-default FS
 

 Key: MAPREDUCE-5224
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5224
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Reporter: Xi Fang
Assignee: Xi Fang
Priority: Minor
 Fix For: 1-win

 Attachments: MAPREDUCE-5224.2.patch, MAPREDUCE-5224.patch


 JobTracker today expects the system directory to be in the default file 
 system:
 if (fs == null) {
   fs = mrOwner.doAs(new PrivilegedExceptionAction<FileSystem>() {
     public FileSystem run() throws IOException {
       return FileSystem.get(conf);
   }});
 }
 ...
   public String getSystemDir() {
     Path sysDir = new Path(conf.get("mapred.system.dir", 
         "/tmp/hadoop/mapred/system"));
     return fs.makeQualified(sysDir).toString();
   }
 In a cloud environment such as Azure, the default file system is set to ASV 
 (Windows Azure Blob Storage), but we would still like the system directory to 
 be in DFS. We should change JobTracker to allow that.
