date:20120627


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401997#comment-13401997
 ] 

Ahmed Radwan commented on MAPREDUCE-4346:
-

Thanks Arun and Tucu for the comments!

Tucu, I have modified the semantics so that the retired flag no longer 
overrides the status filter, and updated the newly added tests to reflect that.

I have also replaced the inner loop with a HashSet lookup, per Arun's suggestion.

 Adding a refined version of JobTracker.getAllJobs() and exposing through the 
 JobClient
 --

 Key: MAPREDUCE-4346
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, 
 MAPREDUCE-4346_rev3.patch


 The current implementation of JobTracker.getAllJobs() returns all submitted 
 jobs in any state, plus all retired jobs. This list can be long, which is 
 unnecessary overhead for clients that are only interested in jobs in 
 specific state(s). 
 It would be beneficial to add a refined version that returns only jobs with 
 the requested statuses and makes including retired jobs optional. 
 I'll be uploading an initial patch momentarily.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan updated MAPREDUCE-4346:


Attachment: MAPREDUCE-4346_rev4.patch


[jira] [Updated] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan updated MAPREDUCE-4346:


Status: Patch Available  (was: Open)


[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

[
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402006#comment-13402006
]

Hadoop QA commented on MAPREDUCE-4346:
--

-1 overall. Here are the results of testing the latest attachment

http://issues.apache.org/jira/secure/attachment/12533607/MAPREDUCE-4346_rev4.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 1 new or modified test
files.

-1 patch. The patch command could not apply the patch.

Console output:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2519//console

This message is automatically generated.


[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402013#comment-13402013
 ] 

Arun C Murthy commented on MAPREDUCE-4346:
--

Asking again: what is the use case? I really don't like the API, particularly 
since it's a public API.


[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402047#comment-13402047
 ] 

Ahmed Radwan commented on MAPREDUCE-4346:
-

As I highlighted in the ticket description above, the JobClient only exposes 
getAllJobs(), which returns all submitted jobs in any state, including all 
retired jobs. This list is long and is unneeded overhead for clients that are 
only interested in jobs in specific states. 

One use case is a monitoring service that uses the JobClient and periodically 
calls getAllJobs() to keep track of submitted jobs. With the current 
getAllJobs(), every periodic call transfers a list that is unnecessarily long 
and largely redundant with what earlier calls already returned.

The new API lets clients filter the long list that getAllJobs() normally 
returns. As part of the call, the client can now specify the job statuses of 
interest and whether retired jobs should be included.

What do you think Arun?
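
To make the proposed semantics concrete, here is a minimal sketch of the filtering behaviour discussed in this thread. It is not the patch itself: JobFilterSketch, State, JobInfo and filterJobs are hypothetical stand-ins, not Hadoop classes or the actual JobClient signature. It only illustrates two points from the comments above: the status check is a set lookup rather than an inner loop, and the retired flag does not override the status filter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for illustration only; not the Hadoop classes.
final class JobFilterSketch {

    enum State { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

    static final class JobInfo {
        final String id;
        final State state;
        final boolean retired;

        JobInfo(String id, State state, boolean retired) {
            this.id = id;
            this.state = state;
            this.retired = retired;
        }
    }

    /**
     * Returns the jobs whose state is in wantedStates. Retired jobs are
     * considered only when includeRetired is true, and even then they must
     * still match the status filter.
     */
    static List<JobInfo> filterJobs(List<JobInfo> all, Set<State> wantedStates,
                                    boolean includeRetired) {
        List<JobInfo> result = new ArrayList<>();
        for (JobInfo job : all) {
            if (job.retired && !includeRetired) {
                continue; // caller did not ask for retired jobs
            }
            if (wantedStates.contains(job.state)) { // set lookup, no inner loop
                result.add(job);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<JobInfo> jobs = Arrays.asList(
            new JobInfo("job_1", State.RUNNING, false),
            new JobInfo("job_2", State.SUCCEEDED, true),
            new JobInfo("job_3", State.RUNNING, true));
        Set<State> running = new HashSet<>(Arrays.asList(State.RUNNING));
        for (JobInfo job : filterJobs(jobs, running, false)) {
            System.out.println(job.id + " " + job.state);
        }
    }
}
```

A monitoring client would pass only the states it tracks, so each periodic call returns a short list instead of every job the JobTracker knows about.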


[jira] [Updated] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-4372:
-

Attachment: MAPREDUCE-4372-1.patch

 Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor 
 and Shutdown hook manager
 --

 Key: MAPREDUCE-4372
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4372
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, resourcemanager
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Devaraj K
Assignee: Devaraj K
 Attachments: MAPREDUCE-4372-1.patch, MAPREDUCE-4372.patch, 
 rm-threaddump.out


 Please find the attached resource manager thread dump for the issue.


[jira] [Updated] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-4372:
-

Status: Open  (was: Patch Available)


[jira] [Updated] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-4372:
-

Status: Patch Available  (was: Open)

Thanks a lot Robert for looking into the patch. I have updated the patch as per 
your suggestion.


[jira] [Commented] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager

[
https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402126#comment-13402126
]

Hadoop QA commented on MAPREDUCE-4372:
--

-1 overall. Here are the results of testing the latest attachment

http://issues.apache.org/jira/secure/attachment/12533636/MAPREDUCE-4372-1.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified
tests. Please justify why no new tests are needed for this patch. Also please
list what manual steps were performed to verify this patch.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

-1 javadoc. The javadoc tool appears to have generated 2 warning messages.

+1 eclipse:eclipse. The patch built with eclipse:eclipse.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed unit tests in
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

+1 contrib tests. The patch passed contrib unit tests.

Test results:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2520//testReport/
Console output:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2520//console

This message is automatically generated.


[jira] [Commented] (MAPREDUCE-4228) mapreduce.job.reduce.slowstart.completedmaps is not working properly to delay the scheduling of the reduce tasks


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402155#comment-13402155
 ] 

Hudson commented on MAPREDUCE-4228:
---

Integrated in Hadoop-Hdfs-trunk #1089 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1089/])
MAPREDUCE-4228. mapreduce.job.reduce.slowstart.completedmaps is not working 
properly (Jason Lowe via bobby) (Revision 1354181)

 Result = FAILURE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1354181
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRMContainerAllocator.java


 mapreduce.job.reduce.slowstart.completedmaps is not working properly to delay 
 the scheduling of the reduce tasks
 

 Key: MAPREDUCE-4228
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4228
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2
Affects Versions: 0.23.1
Reporter: Jason Lowe
Assignee: Jason Lowe
 Fix For: 0.23.3, 2.0.1-alpha, 3.0.0

 Attachments: MAPREDUCE-4228.patch, MAPREDUCE-4228.patch, 
 MAPREDUCE-4228.patch


 If no more map tasks need to be scheduled but not all have completed, the 
 ApplicationMaster will start scheduling reducers even if the number of 
 completed maps has not met the mapreduce.job.reduce.slowstart.completedmaps 
 threshold.  For example, if the property is set to 1.0 all maps should 
 complete before any reducers are scheduled.  However the reducers are 
 scheduled as soon as the last map task is assigned to a container.  For a job 
 with very long-running maps, a cluster with enough capacity to launch all map 
 tasks could cause reducers to launch prematurely and waste cluster resources.
 Thanks to Phil Su for discovering this issue.
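
 The difference between the reported behaviour and the expected behaviour can be sketched as follows. This is only a simplified model of the decision described above, assuming the threshold is a fraction of the total map count; SlowStartSketch and both methods are hypothetical names and are not the RMContainerAllocator code.

```java
// Simplified, hypothetical model of the slowstart decision; not Hadoop code.
final class SlowStartSketch {

    /**
     * Models the reported bug: once every map has been assigned to a
     * container, reducers are scheduled regardless of how many maps finished.
     */
    static boolean shouldScheduleReducesBuggy(int totalMaps, int assignedMaps,
                                              int completedMaps, double slowstart) {
        if (assignedMaps == totalMaps) {
            return true; // no more maps to schedule, so reducers start early
        }
        return completedMaps >= Math.ceil(slowstart * totalMaps);
    }

    /**
     * Expected behaviour: reducers wait until the completed-map count reaches
     * the configured fraction, no matter how many maps are merely assigned.
     */
    static boolean shouldScheduleReduces(int totalMaps, int completedMaps,
                                         double slowstart) {
        return completedMaps >= Math.ceil(slowstart * totalMaps);
    }

    public static void main(String[] args) {
        // 10 maps, all assigned, only 3 completed, threshold 1.0
        System.out.println(shouldScheduleReducesBuggy(10, 10, 3, 1.0));
        System.out.println(shouldScheduleReduces(10, 3, 1.0));
    }
}
```

 With the threshold at 1.0 and long-running maps, the first method lets reducers occupy containers while they have nothing to fetch, which is the wasted capacity the description refers to.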


[jira] [Created] (MAPREDUCE-4378) hadoop-validate-setup.sh fails to execute kinit command in secure mode

2012-06-27 Thread Nishan Shetty (JIRA)

Nishan Shetty created MAPREDUCE-4378:


 Summary: hadoop-validate-setup.sh fails to execute kinit command 
in secure mode
 Key: MAPREDUCE-4378
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4378
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.1-alpha, 3.0.0
 Environment: SUSE Linux Enterprise Server 11 (x86_64)
VERSION = 11
PATCHLEVEL = 1
Reporter: Nishan Shetty


hadoop-validate-setup.sh refers to an invalid kinit location.


[jira] [Commented] (MAPREDUCE-4228) mapreduce.job.reduce.slowstart.completedmaps is not working properly to delay the scheduling of the reduce tasks


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402198#comment-13402198
 ] 

Hudson commented on MAPREDUCE-4228:
---

Integrated in Hadoop-Hdfs-0.23-Build #299 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/299/])
svn merge -c 1354181 FIXES: MAPREDUCE-4228. 
mapreduce.job.reduce.slowstart.completedmaps is not working properly (Jason 
Lowe via bobby) (Revision 1354185)

 Result = UNSTABLE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1354185
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRMContainerAllocator.java



[jira] [Updated] (MAPREDUCE-4049) plugin for generic shuffle service

2012-06-27 Thread Avner BenHanoch (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Avner BenHanoch updated MAPREDUCE-4049:
---

Attachment: HADOOP-1.x.y-review-oriented.patch

This patch replaces all my previous patches. It is written to ease code 
review by making only minimal changes to the existing code. *I believe 
anyone can verify this patch at a glance!*

(My old patches included design enhancements: moving the plugins' shared code 
out of ReduceCopier into the plugins' base class, and making ReduceCopier a 
standalone class instead of an inner class of ReduceTask.)


 plugin for generic shuffle service
 --

 Key: MAPREDUCE-4049
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: performance, task, tasktracker
Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0
Reporter: Avner BenHanoch
  Labels: merge, plugin, rdma, shuffle
 Attachments: HADOOP-1.0.2.patch, HADOOP-1.0.x.patch, 
 HADOOP-1.1.patch, HADOOP-1.x.y-review-oriented.patch, Hadoop Shuffle Consumer 
 Plugin TLD.rtf, Hadoop Shuffle Provider Plugin TLD.rtf, mapred-site.xml


 Support a generic shuffle service as a set of two plugins: ShuffleProvider and 
 ShuffleConsumer.
 This will satisfy the following needs:
 # Better shuffle and merge performance. For example, we are working on a 
 shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, 
 or Infiniband) instead of using the current HTTP shuffle. Based on the fast 
 RDMA shuffle, the plugin can also use a suitable merge approach during 
 the intermediate merges, giving much better performance.
 # Satisfy MAPREDUCE-3060: a generic shuffle service that avoids a hidden 
 dependency of the NodeManager on a specific version of the mapreduce shuffle 
 (currently targeted to 0.24.0).
 References:
 # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu 
 from Auburn University with others, 
 [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
 # I am attaching 2 documents with suggested Top Level Design for both plugins 
 (currently, based on 1.0 branch)


[jira] [Commented] (MAPREDUCE-4377) TaskRunner javaopts parsing doesn't handle embedded spaces

[
https://issues.apache.org/jira/browse/MAPREDUCE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402230#comment-13402230
]

Robert Joseph Evans commented on MAPREDUCE-4377:

John,

That is very true, and if you can fix it I would be very happy to commit it for
you. However, I don't think this is the only place in the code that has
problems with embedded spaces. I'm not saying that we should not fix it; we
should. Just be aware that there be monsters here. Also be aware that you may
run into some Windows vs. POSIX (bash) issues when trying to parse the
arguments. Hopefully not too many, though.

TaskRunner javaopts parsing doesn't handle embedded spaces
--

Key: MAPREDUCE-4377
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4377
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: task-controller
Affects Versions: trunk
Environment: java options containing escaped or non-escaped embedded
spaces.
Reporter: John Gordon

TaskRunner::GetVMArgs reads getChildJavaOpts as one space-delimited string,
then splits it on ' ' and tries to reason about individual options from there.
The problem with this approach is that java options may contain embedded
spaces in many legitimate cases. This means it is reasoning about incomplete
option strings and cannot do the preprocessing needed to handle things like
escape characters or matched quotation marks.
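
As a rough illustration of the kind of parsing the description asks for, the sketch below tokenizes an options string while honouring matched quotes and backslash escapes. It is not the TaskRunner code and not a proposed patch; JavaOptsTokenizerSketch and tokenize are hypothetical names, and the quoting rules are a simplified POSIX-shell-like assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; simplified shell-like quoting, not Hadoop code.
final class JavaOptsTokenizerSketch {

    /**
     * Splits opts on unquoted whitespace. Single or double quotes group
     * characters into one token, and a backslash makes the next character
     * literal. Quote characters themselves are dropped from the output.
     */
    static List<String> tokenize(String opts) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inToken = false;
        char quote = 0; // 0 means "not inside quotes"
        for (int i = 0; i < opts.length(); i++) {
            char c = opts.charAt(i);
            if (c == '\\' && i + 1 < opts.length()) {
                current.append(opts.charAt(++i)); // escaped character, taken literally
                inToken = true;
            } else if (quote != 0) {
                if (c == quote) {
                    quote = 0; // closing quote
                } else {
                    current.append(c);
                }
            } else if (c == '"' || c == '\'') {
                quote = c; // opening quote
                inToken = true;
            } else if (Character.isWhitespace(c)) {
                if (inToken) {
                    tokens.add(current.toString());
                    current.setLength(0);
                    inToken = false;
                }
            } else {
                current.append(c);
                inToken = true;
            }
        }
        if (quote != 0) {
            throw new IllegalArgumentException("Unmatched quote in: " + opts);
        }
        if (inToken) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("-Xmx512m -Dfoo=\"a b\" -Dbar=c\\ d"));
    }
}
```

Treating backslash as an escape character is exactly the sort of choice that would mangle Windows paths, so a real fix would have to decide how to handle the Windows vs. POSIX differences mentioned in the comment above.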

[jira] [Created] (MAPREDUCE-4379) Node Manager throws java.lang.OutOfMemoryError: Java heap space due to org.apache.hadoop.fs.LocalDirAllocator.contexts

Devaraj K created MAPREDUCE-4379:


 Summary: Node Manager throws java.lang.OutOfMemoryError: Java heap 
space due to org.apache.hadoop.fs.LocalDirAllocator.contexts
 Key: MAPREDUCE-4379
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4379
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, nodemanager
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Devaraj K
Assignee: Devaraj K
Priority: Critical


{code:xml}
Exception in thread "Container Monitor" java.lang.OutOfMemoryError: Java heap space
    at java.io.BufferedReader.<init>(BufferedReader.java:80)
    at java.io.BufferedReader.<init>(BufferedReader.java:91)
    at org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:410)
    at org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:171)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl$MonitoringThread.run(ContainersMonitorImpl.java:389)
Exception in thread "LocalizerRunner for container_1340690914008_10890_01_03" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOfRange(Arrays.java:3209)
    at java.lang.String.<init>(String.java:215)
    at com.sun.org.apache.xerces.internal.xni.XMLString.toString(XMLString.java:185)
    at com.sun.org.apache.xerces.internal.parsers.AbstractDOMParser.characters(AbstractDOMParser.java:1188)
    at com.sun.org.apache.xerces.internal.xinclude.XIncludeHandler.characters(XIncludeHandler.java:1084)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:464)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:235)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:180)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1738)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1689)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1635)
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:722)
    at org.apache.hadoop.conf.Configuration.setStrings(Configuration.java:1300)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.initDirs(ContainerLocalizer.java:375)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:127)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:103)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:862)
{code}


[jira] [Commented] (MAPREDUCE-4379) Node Manager throws java.lang.OutOfMemoryError: Java heap space due to org.apache.hadoop.fs.LocalDirAllocator.contexts


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402244#comment-13402244
 ] 

Devaraj K commented on MAPREDUCE-4379:
--

{code:title=ContainerLocalizer.java|borderStyle=solid}
this.appDirs =
  new LocalDirAllocator(String.format(APPCACHE_CTXT_FMT, appId));
this.userDirs =
  new LocalDirAllocator(String.format(USERCACHE_CTXT_FMT, appId));
this.pendingResources = new HashMap<LocalResource,Future<Path>>();
{code}

Here, localization creates two LocalDirAllocator instances for every 
application.


{code:title=LocalDirAllocator.java|borderStyle=solid}
  private AllocatorPerContext obtainContext(String contextCfgItemName) {
    synchronized (contexts) {
      AllocatorPerContext l = contexts.get(contextCfgItemName);
      if (l == null) {
        contexts.put(contextCfgItemName, 
            (l = new AllocatorPerContext(contextCfgItemName)));
      }
      return l;
    }
  }
{code}

Those two instances internally create AllocatorPerContext instances and 
add them to contexts when obtaining a context. Entries keep being added for 
every application and are never removed from the map anywhere, which 
leads to an OOM after running for some time.
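
The growth pattern can be reproduced in isolation with a small model of the map. The sketch below is not LocalDirAllocator: ContextLeakSketch is a hypothetical class, the context objects are plain placeholders, and removeContext is only one possible remedy shown for illustration, not a method this thread confirms exists.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a static per-context cache; not Hadoop code.
final class ContextLeakSketch {

    // Process-wide map, modelled on the 'contexts' map quoted above.
    private static final Map<String, Object> contexts = new HashMap<>();

    /** Returns the cached context for name, creating it on first use. */
    static Object obtainContext(String name) {
        synchronized (contexts) {
            Object ctx = contexts.get(name);
            if (ctx == null) {
                ctx = new Object(); // placeholder for a per-context allocator
                contexts.put(name, ctx);
            }
            return ctx;
        }
    }

    /** Hypothetical cleanup hook: drops the entry once an application is done. */
    static void removeContext(String name) {
        synchronized (contexts) {
            contexts.remove(name);
        }
    }

    static int size() {
        synchronized (contexts) {
            return contexts.size();
        }
    }

    public static void main(String[] args) {
        // One entry per application id: without cleanup the map only grows.
        for (int app = 0; app < 1000; app++) {
            obtainContext("appcache.ctxt.application_" + app);
        }
        System.out.println("entries after 1000 applications: " + size());
    }
}
```

Because the key contains the application id, every new application adds entries that are never looked up again, so the map size tracks the total number of applications ever localized rather than the number currently running.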

 Node Manager throws java.lang.OutOfMemoryError: Java heap space due to 
 org.apache.hadoop.fs.LocalDirAllocator.contexts
 --

 Key: MAPREDUCE-4379
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4379
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, nodemanager
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Devaraj K
Assignee: Devaraj K
Priority: Critical

 {code:xml}
 Exception in thread Container Monitor java.lang.OutOfMemoryError: Java heap 
 space
   at java.io.BufferedReader.init(BufferedReader.java:80)
   at java.io.BufferedReader.init(BufferedReader.java:91)
   at 
 org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:410)
   at 
 org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:171)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl$MonitoringThread.run(ContainersMonitorImpl.java:389)
   Exception in thread LocalizerRunner for 
 container_1340690914008_10890_01_03 java.lang.OutOfMemoryError: Java 
 heap space
   at java.util.Arrays.copyOfRange(Arrays.java:3209)
   at java.lang.String.<init>(String.java:215)
   at 
 com.sun.org.apache.xerces.internal.xni.XMLString.toString(XMLString.java:185)
   at 
 com.sun.org.apache.xerces.internal.parsers.AbstractDOMParser.characters(AbstractDOMParser.java:1188)
   at 
 com.sun.org.apache.xerces.internal.xinclude.XIncludeHandler.characters(XIncludeHandler.java:1084)
   at 
 com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:464)
   at 
 com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
   at 
 com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
   at 
 com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
   at 
 com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:235)
   at 
 com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284)
   at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:180)
   at 
 org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1738)
   at 
 org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1689)
   at 
 org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1635)
   at org.apache.hadoop.conf.Configuration.set(Configuration.java:722)
   at 
 org.apache.hadoop.conf.Configuration.setStrings(Configuration.java:1300)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.initDirs(ContainerLocalizer.java:375)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:127)
   at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:103)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:862)
 {code}


[jira] [Updated] (MAPREDUCE-4379) Node Manager throws java.lang.OutOfMemoryError: Java heap space due to org.apache.hadoop.fs.LocalDirAllocator.contexts


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-4379:
---

 Target Version/s: 0.23.3
Affects Version/s: 0.23.3

I really would like to see this go into 0.23 as well.

 Node Manager throws java.lang.OutOfMemoryError: Java heap space due to 
 org.apache.hadoop.fs.LocalDirAllocator.contexts
 --

 Key: MAPREDUCE-4379
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4379
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, nodemanager
Affects Versions: 0.23.3, 2.0.0-alpha, 3.0.0
Reporter: Devaraj K
Assignee: Devaraj K
Priority: Critical



[jira] [Commented] (MAPREDUCE-4228) mapreduce.job.reduce.slowstart.completedmaps is not working properly to delay the scheduling of the reduce tasks


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402252#comment-13402252
 ] 

Hudson commented on MAPREDUCE-4228:
---

Integrated in Hadoop-Mapreduce-trunk #1122 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1122/])
MAPREDUCE-4228. mapreduce.job.reduce.slowstart.completedmaps is not working 
properly (Jason Lowe via bobby) (Revision 1354181)

 Result = FAILURE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1354181
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRMContainerAllocator.java


 mapreduce.job.reduce.slowstart.completedmaps is not working properly to delay 
 the scheduling of the reduce tasks
 

 Key: MAPREDUCE-4228
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4228
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2
Affects Versions: 0.23.1
Reporter: Jason Lowe
Assignee: Jason Lowe
 Fix For: 0.23.3, 2.0.1-alpha, 3.0.0

 Attachments: MAPREDUCE-4228.patch, MAPREDUCE-4228.patch, 
 MAPREDUCE-4228.patch


 If no more map tasks need to be scheduled but not all have completed, the 
 ApplicationMaster will start scheduling reducers even if the number of 
 completed maps has not met the mapreduce.job.reduce.slowstart.completedmaps 
 threshold.  For example, if the property is set to 1.0 all maps should 
 complete before any reducers are scheduled.  However the reducers are 
 scheduled as soon as the last map task is assigned to a container.  For a job 
 with very long-running maps, a cluster with enough capacity to launch all map 
 tasks could cause reducers to launch prematurely and waste cluster resources.
 Thanks to Phil Su for discovering this issue.
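
 The intended behaviour of the threshold described above can be sketched as 
 follows. This is an illustration only: SlowStart and canScheduleReduces are 
 hypothetical names, and rounding the required count up with Math.ceil is an 
 assumption rather than something stated in this issue. The point is that the 
 decision depends on the number of completed maps, not on whether every map 
 has been assigned to a container.

```java
// Hypothetical helper; the name and signature are not from the Hadoop source.
class SlowStart {
  // Returns true once enough maps have completed to allow reducers to be
  // scheduled. slowstart is the configured fraction in [0.0, 1.0].
  static boolean canScheduleReduces(int totalMaps, int completedMaps,
      double slowstart) {
    // Assumption: the required count is rounded up to a whole map.
    int required = (int) Math.ceil(slowstart * totalMaps);
    return completedMaps >= required;
  }
}
```

 With the property set to 1.0 and 10 maps in total, the check stays false 
 while 9 maps have completed, even if the tenth is already running.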


[jira] [Commented] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402255#comment-13402255
 ] 

Robert Joseph Evans commented on MAPREDUCE-4372:


Changes look good to me +1.

 Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor 
 and Shutdown hook manager
 --

 Key: MAPREDUCE-4372
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4372
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, resourcemanager
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Devaraj K
Assignee: Devaraj K
 Attachments: MAPREDUCE-4372-1.patch, MAPREDUCE-4372.patch, 
 rm-threaddump.out


 Please find the attached resource manager thread dump for the issue.


[jira] [Updated] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-4372:
---

   Resolution: Fixed
Fix Version/s: 3.0.0
   2.0.1-alpha
   Status: Resolved  (was: Patch Available)

Thanks Devaraj,

I put this into trunk and branch-2.

 Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor 
 and Shutdown hook manager
 --

 Key: MAPREDUCE-4372
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4372
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, resourcemanager
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Devaraj K
Assignee: Devaraj K
 Fix For: 2.0.1-alpha, 3.0.0

 Attachments: MAPREDUCE-4372-1.patch, MAPREDUCE-4372.patch, 
 rm-threaddump.out


 Please find the attached resource manager thread dump for the issue.


[jira] [Commented] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402262#comment-13402262
 ] 

Hudson commented on MAPREDUCE-4372:
---

Integrated in Hadoop-Hdfs-trunk-Commit #2462 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2462/])
MAPREDUCE-4372. Deadlock in Resource Manager (Devaraj K via bobby) 
(Revision 1354531)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1354531
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java


 Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor 
 and Shutdown hook manager
 --

 Key: MAPREDUCE-4372
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4372
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, resourcemanager
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Devaraj K
Assignee: Devaraj K
 Fix For: 2.0.1-alpha, 3.0.0

 Attachments: MAPREDUCE-4372-1.patch, MAPREDUCE-4372.patch, 
 rm-threaddump.out


 Please find the attached resource manager thread dump for the issue.


[jira] [Commented] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402264#comment-13402264
 ] 

Hudson commented on MAPREDUCE-4372:
---

Integrated in Hadoop-Common-trunk-Commit #2393 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2393/])
MAPREDUCE-4372. Deadlock in Resource Manager (Devaraj K via bobby) 
(Revision 1354531)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1354531
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java


 Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor 
 and Shutdown hook manager
 --

 Key: MAPREDUCE-4372
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4372
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, resourcemanager
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Devaraj K
Assignee: Devaraj K
 Fix For: 2.0.1-alpha, 3.0.0

 Attachments: MAPREDUCE-4372-1.patch, MAPREDUCE-4372.patch, 
 rm-threaddump.out


 Please find the attached resource manager thread dump for the issue.


[jira] [Commented] (MAPREDUCE-4371) Check for cyclic dependencies in Jobcontrol job DAG

[
https://issues.apache.org/jira/browse/MAPREDUCE-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402268#comment-13402268
]

Robert Joseph Evans commented on MAPREDUCE-4371:

Just a few comments about the patch.

# The new file needs an Apache license comment at the top.
# It would be nice to have a comment in the test about what the test class is
intended to cover.
# The test looks like it is passing, but without any exceptions ever being
caught in the test. The run method catches all exceptions and then kills all
of the jobs. This is because run is intended to potentially be called on its
own thread. Please instead verify that all of the jobs are marked as failed at
the end.
# Inside the patch itself it looks like there are a few places where the
formatting is off. We use 2 spaces for indentation and try to wrap the lines at
under 80 characters.

Other than that it looks good. Also, a bit of process: when you upload a
patch, please mark the box indicating that it is intended for inclusion in
Apache, and then hit the submit patch button. This will trigger
Jenkins to try and test the patch against trunk. I am going to hit submit
patch for you, but you have to tick the checkbox yourself because it is your
code and your copyright.

Check for cyclic dependencies in Jobcontrol job DAG
---

Key: MAPREDUCE-4371
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4371
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv1
Affects Versions: 3.0.0
Reporter: madhukara phatak
Attachments: MAPREDUCE-4371.patch

In current implementation of JobControl, whenever there is a cyclic
dependency between the jobs it throws a Stack overflow exception. This jira
adds a cyclic check to jobcontrol.
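
One common way to implement such a check is a depth-first search that tracks
the jobs on the current path. The sketch below is an illustration under that
assumption and is not taken from the attached patch: JobGraph, addJob,
addDependency, and hasCycle are hypothetical names, and jobs are modelled as
plain strings rather than JobControl objects.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model: jobs are names, and deps maps each job
// to the jobs it depends on. This is not the JobControl API.
class JobGraph {
  private final Map<String, List<String>> deps =
      new HashMap<String, List<String>>();

  void addJob(String job) {
    if (!deps.containsKey(job)) {
      deps.put(job, new ArrayList<String>());
    }
  }

  void addDependency(String job, String dependsOn) {
    addJob(job);
    addJob(dependsOn);
    deps.get(job).add(dependsOn);
  }

  // Depth-first search; a job seen again while it is still on the current
  // path means the graph has a cycle.
  boolean hasCycle() {
    Set<String> done = new HashSet<String>();
    Set<String> onPath = new HashSet<String>();
    for (String job : deps.keySet()) {
      if (visit(job, done, onPath)) {
        return true;
      }
    }
    return false;
  }

  private boolean visit(String job, Set<String> done, Set<String> onPath) {
    if (onPath.contains(job)) {
      return true;   // back edge: job depends on itself transitively
    }
    if (done.contains(job)) {
      return false;  // already fully explored, no cycle through here
    }
    onPath.add(job);
    for (String dep : deps.get(job)) {
      if (visit(dep, done, onPath)) {
        return true;
      }
    }
    onPath.remove(job);
    done.add(job);
    return false;
  }
}
```

Because each job is explored at most once, the recursion depth is bounded by
the number of jobs, so a cyclic graph is reported instead of recursing until
the stack overflows.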

[jira] [Updated] (MAPREDUCE-4371) Check for cyclic dependencies in Jobcontrol job DAG


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-4371:
---

Target Version/s: 3.0.0
  Status: Patch Available  (was: Open)

 Check for cyclic dependencies in Jobcontrol job DAG
 ---

 Key: MAPREDUCE-4371
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4371
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Affects Versions: 3.0.0
Reporter: madhukara phatak
 Attachments: MAPREDUCE-4371.patch


 In current implementation of JobControl, whenever there is a cyclic 
 dependency between the jobs it throws a Stack overflow exception. This jira 
 adds a cyclic check to jobcontrol.


[jira] [Commented] (MAPREDUCE-4360) Capacity Scheduler Hierarchical leaf queue does not honor the max capacity of container queue

2012-06-27 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402271#comment-13402271
 ] 

Jason Lowe commented on MAPREDUCE-4360:
---

This JIRA indicates that trunk is affected, but I believe this has already been 
addressed in trunk (and branch-2 and branch-0.23) by MAPREDUCE-3683.

 Capacity Scheduler Hierarchical leaf queue does not honor the max capacity of 
 container queue
 -

 Key: MAPREDUCE-4360
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4360
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.22.1, trunk
Reporter: Mayank Bansal
Assignee: Mayank Bansal
 Attachments: MAPREDUCE-4360-22.patch





[jira] [Commented] (MAPREDUCE-4371) Check for cyclic dependencies in Jobcontrol job DAG

[
https://issues.apache.org/jira/browse/MAPREDUCE-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402282#comment-13402282
]

Hadoop QA commented on MAPREDUCE-4371:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12533455/MAPREDUCE-4371.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 1 new or modified test
files.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

-1 javadoc. The javadoc tool appears to have generated 2 warning messages.

+1 eclipse:eclipse. The patch built with eclipse:eclipse.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed unit tests in
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

+1 contrib tests. The patch passed contrib unit tests.

Test results:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2521//testReport/
Console output:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2521//console

This message is automatically generated.

Check for cyclic dependencies in Jobcontrol job DAG
---

In current implementation of JobControl, whenever there is a cyclic
dependency between the jobs it throws a Stack overflow exception. This jira
adds a cyclic check to jobcontrol.

[jira] [Commented] (MAPREDUCE-4326) Resurrect RM Restart

2012-06-27 Thread Tsuyoshi OZAWA (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402297#comment-13402297
 ] 

Tsuyoshi OZAWA commented on MAPREDUCE-4326:
---

Sharad, 

MAPREDUCE-2713 is now marked as a duplicate of this ticket (MAPREDUCE-4326).

 Resurrect RM Restart 
 -

 Key: MAPREDUCE-4326
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4326
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, resourcemanager
Affects Versions: 2.0.0-alpha
Reporter: Arun C Murthy
Assignee: Bikas Saha
 Attachments: MR-4343.1.patch


 We should resurrect 'RM Restart' which we disabled sometime during the RM 
 refactor.


[jira] [Commented] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402304#comment-13402304
 ] 

Hudson commented on MAPREDUCE-4372:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #2412 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2412/])
MAPREDUCE-4372. Deadlock in Resource Manager (Devaraj K via bobby) 
(Revision 1354531)

 Result = FAILURE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1354531
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java


 Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor 
 and Shutdown hook manager
 --

 Key: MAPREDUCE-4372
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4372
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, resourcemanager
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Devaraj K
Assignee: Devaraj K
 Fix For: 2.0.1-alpha, 3.0.0

 Attachments: MAPREDUCE-4372-1.patch, MAPREDUCE-4372.patch, 
 rm-threaddump.out


 Please find the attached resource manager thread dump for the issue.


[jira] [Updated] (MAPREDUCE-4371) Check for cyclic dependencies in Jobcontrol job DAG

2012-06-27 Thread madhukara phatak (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

madhukara phatak updated MAPREDUCE-4371:


Attachment: MAPREDUCE-4371-1.patch

Updated the patch to fix test case and style issues.

 Check for cyclic dependencies in Jobcontrol job DAG
 ---

 Key: MAPREDUCE-4371
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4371
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Affects Versions: 3.0.0
Reporter: madhukara phatak
 Attachments: MAPREDUCE-4371-1.patch, MAPREDUCE-4371.patch


 In current implementation of JobControl, whenever there is a cyclic 
 dependency between the jobs it throws a Stack overflow exception. This jira 
 adds a cyclic check to jobcontrol.


[jira] [Commented] (MAPREDUCE-4376) TestClusterMRNotification times out


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402319#comment-13402319
 ] 

Kihwal Lee commented on MAPREDUCE-4376:
---

It used to be

job 1, SUCCEEDED, SUCCEEDED
job 2, KILLED, KILLED
job 3, FAILED, FAILED

Now it's getting

job 1, SUCCEEDED, SUCCEEDED
job 2, ERROR, ERROR

The test hangs after job 2. 


 TestClusterMRNotification times out
 ---

 Key: MAPREDUCE-4376
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4376
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, test
Affects Versions: 2.0.1-alpha
Reporter: Jason Lowe

 The TestClusterMRNotification test is often timing out.  git bisect tests 
 narrowed it down to MAPREDUCE-3921, as the test consistently passes before 
 that change and times out most of the time after picking up that change.


[jira] [Commented] (MAPREDUCE-4326) Resurrect RM Restart

2012-06-27 Thread Tsuyoshi OZAWA (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402323#comment-13402323
 ] 

Tsuyoshi OZAWA commented on MAPREDUCE-4326:
---

Bikas,

What's going on? I can help if you are having difficulty with the 
preliminary design sketch.

 Resurrect RM Restart 
 -

 Key: MAPREDUCE-4326
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4326
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, resourcemanager
Affects Versions: 2.0.0-alpha
Reporter: Arun C Murthy
Assignee: Bikas Saha
 Attachments: MR-4343.1.patch


 We should resurrect 'RM Restart' which we disabled sometime during the RM 
 refactor.


[jira] [Commented] (MAPREDUCE-4371) Check for cyclic dependencies in Jobcontrol job DAG

[
https://issues.apache.org/jira/browse/MAPREDUCE-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402328#comment-13402328
]

Hadoop QA commented on MAPREDUCE-4371:
--

-1 overall. Here are the results of testing the latest attachment

http://issues.apache.org/jira/secure/attachment/12533666/MAPREDUCE-4371-1.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 1 new or modified test
files.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

-1 javadoc. The javadoc tool appears to have generated 2 warning messages.

+1 eclipse:eclipse. The patch built with eclipse:eclipse.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed unit tests in
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

+1 contrib tests. The patch passed contrib unit tests.

Test results:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2522//testReport/
Console output:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2522//console

This message is automatically generated.

Check for cyclic dependencies in Jobcontrol job DAG
---

In current implementation of JobControl, whenever there is a cyclic
dependency between the jobs it throws a Stack overflow exception. This jira
adds a cyclic check to jobcontrol.

[jira] [Commented] (MAPREDUCE-4376) TestClusterMRNotification times out


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402359#comment-13402359
 ] 

Kihwal Lee commented on MAPREDUCE-4376:
---

Relevant log entries:

{noformat}
2012-06-27 08:48:55,331 INFO [IPC Server handler 0 on 57856] 
org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Kill Job received 
from client job_1340812108963_0002
2012-06-27 08:48:55,332 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1340812108963_0002Job 
Transitioned from RUNNING to KILL_WAIT
2012-06-27 08:48:55,332 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: 
task_1340812108963_0002_m_00 Task Transitioned from SCHEDULED to KILL_WAIT
2012-06-27 08:48:55,332 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: 
task_1340812108963_0002_m_01 Task Transitioned from SCHEDULED to KILL_WAIT
2012-06-27 08:48:55,333 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: 
task_1340812108963_0002_r_00 Task Transitioned from SCHEDULED to KILL_WAIT
2012-06-27 08:48:55,334 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: 
attempt_1340812108963_0002_m_00_0 TaskAttempt Transitioned from UNASSIGNED 
to KILLED
2012-06-27 08:48:55,334 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: 
attempt_1340812108963_0002_m_01_0 TaskAttempt Transitioned from UNASSIGNED 
to KILLED
2012-06-27 08:48:55,335 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: 
attempt_1340812108963_0002_r_00_0 TaskAttempt Transitioned from UNASSIGNED 
to KILLED
2012-06-27 08:48:55,335 INFO [Thread-45] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Processing the 
event EventType: CONTAINER_DEALLOCATE
2012-06-27 08:48:55,338 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: 
task_1340812108963_0002_m_00 Task Transitioned from KILL_WAIT to KILLED
2012-06-27 08:48:55,338 INFO [Thread-45] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Processing the 
event EventType: CONTAINER_DEALLOCATE
2012-06-27 08:48:55,338 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: 
task_1340812108963_0002_m_01 Task Transitioned from KILL_WAIT to KILLED
2012-06-27 08:48:55,338 INFO [Thread-45] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Processing the 
event EventType: CONTAINER_DEALLOCATE
2012-06-27 08:48:55,339 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: 
task_1340812108963_0002_r_00 Task Transitioned from KILL_WAIT to KILLED
2012-06-27 08:48:55,339 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 1
2012-06-27 08:48:55,339 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 2
2012-06-27 08:48:55,340 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 3
2012-06-27 08:48:55,341 ERROR [Thread-45] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Error in handling 
event type CONTAINER_DEALLOCATE to the ContainreAllocator
java.lang.NullPointerException
at 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$AssignedRequests.get(RMContainerAllocator.java:1103)
at 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.handleEvent(RMContainerAllocator.java:339)
at 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$1.run(RMContainerAllocator.java:191)
2012-06-27 08:48:55,348 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1340812108963_0002Job 
Transitioned from KILL_WAIT to KILLED
2012-06-27 08:48:55,348 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1340812108963_0002Job 
Transitioned from KILLED to ERROR
{noformat}

The code assumes that if the attempt ID is not found in scheduledRequests, it 
will be in assignedRequests. But in this case, it was still in UNASSIGNED.

 TestClusterMRNotification times out
 ---

 Key: MAPREDUCE-4376
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4376
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, test
Affects Versions: 2.0.1-alpha
Reporter: Jason Lowe

 The TestClusterMRNotification test is often timing out.  git bisect tests 
 narrowed it down to MAPREDUCE-3921, as the test consistently passes before 
 that change and times out most of the time after picking up that change.

[jira] [Commented] (MAPREDUCE-4322) Fix command-line length abort issues on Windows

2012-06-27 Thread Bikas Saha (JIRA)

[
https://issues.apache.org/jira/browse/MAPREDUCE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402419#comment-13402419
]

Bikas Saha commented on MAPREDUCE-4322:
---

3. My main concern is that we are not differentiating that the first failure is
due to a bad setup string while the second one is due to a bad cmd string.
Since the code is adding the exact failed command into the exception we could
look for setup in the first case and command in the second case in addition
to sb.toString(). I should have been more clear. I didn't literally mean
setup.toString() because it's a list :)

Fix command-line length abort issues on Windows
---

Key: MAPREDUCE-4322
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4322
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: tasktracker
Environment: Windows, downstream applications with long aggregate
classpaths
Reporter: John Gordon
Assignee: Ivan Mitic
Attachments: MAPREDUCE-4322-branch-1-win(2).patch,
MAPREDUCE-4322-branch-1-win(3).patch, MAPREDUCE-4322-branch-1-win(4).patch,
MAPREDUCE-4322-branch-1-win.patch

Original Estimate: 12h
Remaining Estimate: 12h

When a task is started on the tasktracker, it creates a small batch file to
invoke java and runs that batch. Within the batch file, the invocation of
Java currently has -classpath ${CLASSPATH} inline to the command. That line
often exceeds 8000 characters. This is OK for most Linux distributions
because the line limit env variable is often set much higher than this.
However, on Windows this causes cmd to abort execution. This surfaces in
Hadoop as an unknown failure mode for the task.
I think the easiest and most natural way to fix this is to push the
-classpath option into a config file to take the longest variable part of the
line and put it somewhere that scales better.
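
The failure mode described above can be made visible before the shell is ever invoked. The following is a minimal sketch, not the code from the attached patches: the class name, the method name, and the 8191-character constant are assumptions chosen for illustration. It joins the arguments and refuses to return a line that is over the limit, putting the offending command into the exception message so the caller can report it.

```java
import java.util.List;

public class CommandLengthGuard {
    // Assumed limit for a single cmd.exe command line; adjust for the target shell.
    static final int MAX_COMMAND_LENGTH = 8191;

    /**
     * Joins the arguments with single spaces and rejects the result if it is
     * longer than maxLength. The full command is included in the message so
     * that the failure is no longer an "unknown" one.
     */
    static String buildCommandLine(List<String> args, int maxLength) {
        String line = String.join(" ", args);
        if (line.length() > maxLength) {
            throw new IllegalArgumentException("Command line length " + line.length()
                + " exceeds the limit of " + maxLength + ": " + line);
        }
        return line;
    }
}
```

A caller would pass `MAX_COMMAND_LENGTH` on Windows and could skip the check elsewhere. Moving `-classpath` out of the line, as proposed above, remains the actual fix; this guard only turns a silent abort into a diagnosable error.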

[jira] [Commented] (MAPREDUCE-4360) Capacity Scheduler Hierarchical leaf queue does not honur the max capacity of container queue


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402426#comment-13402426
 ] 

Mayank Bansal commented on MAPREDUCE-4360:
--

Jason,

I did not realize that it is already fixed in trunk; I will update the JIRA. 
Thanks for pointing this out.


Konst,

That's already been done when tasks are assigned to any queue.

Thanks,
Mayank

 Capacity Scheduler Hierarchical leaf queue does not honur the max capacity of 
 container queue
 -

 Key: MAPREDUCE-4360
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4360
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.22.1, trunk
Reporter: Mayank Bansal
Assignee: Mayank Bansal
 Attachments: MAPREDUCE-4360-22.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-4322) Fix command-line length abort issues on Windows

[
https://issues.apache.org/jira/browse/MAPREDUCE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402444#comment-13402444
]

Ivan Mitic commented on MAPREDUCE-4322:
---

3. Oh, thanks for clarifying. My thinking was, from the user's perspective, we
are outputting the actual command that exceeded the limit. Whether it is setup
or command, it is not as relevant. In unit tests, since I know the code, I want
to cover all cases, so I'm testing both. I am leaning toward keeping the code
as is, given that I wouldn't want to have a hardcoded dependency on what is in
the exception message. Let me know if you feel strongly about this.


[jira] [Commented] (MAPREDUCE-4355) Add startTime to RunningJob


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402450#comment-13402450
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4355:
---

reverted from trunk and branch-2

 Add startTime to RunningJob
 ---

 Key: MAPREDUCE-4355
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: mrv1, mrv2
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Fix For: 1.1.0, 2.0.1-alpha

 Attachments: MR-4355_mr1.patch, MR-4355_mr2.patch


 To read the start-time of a particular job, one should not need to 
 getAllJobs() and iterate through them.
 getJob(JobID) returns RunningJob, which doesn't hold the job's start time.
 Hence, we need to either add getJobStatus(JobID) to the API or add startTime 
 to RunningJob. Doing the latter.


[jira] [Commented] (MAPREDUCE-4377) TaskRunner javaopts parsing doesn't handle embedded spaces

2012-06-27 Thread John Gordon (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402452#comment-13402452
 ] 

John Gordon commented on MAPREDUCE-4377:


Thanks Robert!  I agree it won't be an easy fix and may need some 
rearchitecture and significant test additions.

 TaskRunner javaopts parsing doesn't handle embedded spaces
 --

 Key: MAPREDUCE-4377
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4377
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: task-controller
Affects Versions: trunk
 Environment: java options containing escaped or non-escaped embedded 
 spaces.
Reporter: John Gordon

 TaskRunner::GetVMArgs reads getChildJavaOpts as one space-delimited string, 
 then splits it on ' ' and tries to reason on individual options from there.  
 The problem with this approach is that java options may contain embedded 
 spaces in many legitimate cases -- this means it is reasoning on incomplete 
 option strings and cannot do appropriate preprocessing to do things like 
 handle escape characters or matched quotation marks.
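
To illustrate the alternative to splitting on ' ', here is a minimal sketch of a tokenizer that keeps quoted text together and honors backslash escapes. It is not the TaskRunner code; the class and method names are hypothetical, and the quoting rules (double quotes only, backslash escapes the next character) are an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class JavaOptsTokenizer {
    /**
     * Splits opts on whitespace, except inside double quotes. A backslash
     * makes the next character literal. Quotes themselves are dropped.
     */
    static List<String> tokenize(String opts) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        boolean hasToken = false; // true once the current token has started
        for (int i = 0; i < opts.length(); i++) {
            char c = opts.charAt(i);
            if (c == '\\' && i + 1 < opts.length()) {
                current.append(opts.charAt(++i)); // escaped character, taken literally
                hasToken = true;
            } else if (c == '"') {
                inQuotes = !inQuotes;
                hasToken = true; // an empty pair of quotes still yields a token
            } else if (Character.isWhitespace(c) && !inQuotes) {
                if (hasToken) {
                    tokens.add(current.toString());
                    current.setLength(0);
                    hasToken = false;
                }
            } else {
                current.append(c);
                hasToken = true;
            }
        }
        if (inQuotes) {
            throw new IllegalArgumentException("Unmatched quotation mark in: " + opts);
        }
        if (hasToken) {
            tokens.add(current.toString());
        }
        return tokens;
    }
}
```

With these rules, an option such as `-Dfoo="a b"` survives as one token. Treating backslash as an escape would mangle Windows paths, so a real fix would need platform-specific rules, which is part of why this is not an easy change.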


[jira] [Comment Edited] (MAPREDUCE-4355) Add startTime to RunningJob


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402450#comment-13402450
 ] 

Alejandro Abdelnur edited comment on MAPREDUCE-4355 at 6/27/12 6:45 PM:


reverted from trunk, branch-2 and branch-1.

  was (Author: tucu00):
reverted from trunk and branch-2
  

[jira] [Updated] (MAPREDUCE-4342) Distributed Cache gives inconsistent result if cache files get deleted from task tracker


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-4342:
-

Attachment: MAPREDUCE-4342-22-3.patch

Hi Konst,

Thanks for the comments; I have addressed all of them.

Thanks,
Mayank

 Distributed Cache gives inconsistent result if cache files get deleted from 
 task tracker 
 -

 Key: MAPREDUCE-4342
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4342
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.22.0, 1.0.3, trunk
Reporter: Mayank Bansal
Assignee: Mayank Bansal
 Attachments: MAPREDUCE-4342-22-1.patch, MAPREDUCE-4342-22-2.patch, 
 MAPREDUCE-4342-22-3.patch, MAPREDUCE-4342-22.patch





[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402464#comment-13402464
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4346:
---

Ahmed, LGTM. The only thing is that status is an int and you are using a set to 
do the filtering, which means an Integer will be created for each comparison. 
Instead, I'd just iterate over the received filter using a helper method 
*boolean filter(int filter[], int status)*.
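
A minimal sketch of such a helper, assuming the filter is a short array of status codes. Only the signature comes from the comment above; the enclosing class name is hypothetical.

```java
public class StatusFilter {
    /**
     * Returns true if status is one of the accepted values in filter.
     * A linear scan over int[] compares primitives directly, so no
     * Integer objects are created per comparison.
     */
    static boolean filter(int[] filter, int status) {
        for (int accepted : filter) {
            if (accepted == status) {
                return true;
            }
        }
        return false;
    }
}
```

For the handful of job states involved, the linear scan is likely at least as cheap as a hash lookup, though that is an expectation rather than a measurement.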


 Adding a refined version of JobTracker.getAllJobs() and exposing through the 
 JobClient
 --

 Key: MAPREDUCE-4346
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, 
 MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch


 The current implementation for JobTracker.getAllJobs() returns all submitted 
 jobs in any state, in addition to retired jobs. This list can be long and 
 represents an unneeded overhead especially in the case of clients only 
 interested in jobs in specific state(s). 
 It is beneficial to include a refined version where only jobs having specific 
 statuses are returned and retired jobs are optional to include. 
 I'll be uploading an initial patch momentarily.


[jira] [Commented] (MAPREDUCE-4355) Add startTime to RunningJob


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402465#comment-13402465
 ] 

Arun C Murthy commented on MAPREDUCE-4355:
--

bq. Arun, it might be cleaner to add RunningJob.getJobStatus() instead of 
adding startTime, endTime fields to RunningJob and redundantly maintaining them.

+1, good point!


[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402469#comment-13402469
 ] 

Hudson commented on MAPREDUCE-4346:
---

Integrated in Hadoop-Hdfs-trunk-Commit #2465 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2465/])
Reverting MAPREDUCE-4346 r1353757 (Revision 1354656)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1354656
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestJobClient.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestJobClientGetJob.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobClient.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/Cluster.java



[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

[
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402477#comment-13402477
]

Arun C Murthy commented on MAPREDUCE-4346:
--

bq. As I highlighted in the ticket description above: The JobClient only
exposes a getAllJobs() which returns all submitted jobs in any state, the
result also includes all retired jobs. This list is long and represents an
unneeded overhead especially in the case of clients only interested in jobs in
specific states.

Ahmed, I'm not convinced. Yes, it's a bit more overhead, but I don't see how
adding a new public API is going to make a significant difference. IAC, if you
set completed jobs to 0, you'll not get retired jobs. Unless I hear a more
compelling argument I'm -1 on this. Also, please remember that this API is
fairly hard to support with YARN, so that is another problem.


[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402478#comment-13402478
 ] 

Arun C Murthy commented on MAPREDUCE-4346:
--

To be clear: we should refrain from adding public apis without a *compelling* 
use-case to MRv1, particularly when they are going to be hard to support in 
MRv2. Thanks.


[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402479#comment-13402479
 ] 

Hudson commented on MAPREDUCE-4346:
---

Integrated in Hadoop-Common-trunk-Commit #2396 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2396/])
Reverting MAPREDUCE-4346 r1353757 (Revision 1354656)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1354656
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestJobClient.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestJobClientGetJob.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobClient.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/Cluster.java



[jira] [Updated] (MAPREDUCE-4360) Capacity Scheduler Hierarchical leaf queue does not honur the max capacity of container queue


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-4360:
-

Affects Version/s: (was: trunk)

 Capacity Scheduler Hierarchical leaf queue does not honur the max capacity of 
 container queue
 -

 Key: MAPREDUCE-4360
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4360
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.22.1
Reporter: Mayank Bansal
Assignee: Mayank Bansal
 Attachments: MAPREDUCE-4360-22.patch





[jira] [Updated] (MAPREDUCE-4360) Capacity Scheduler Hierarchical leaf queue does not honur the max capacity of container queue


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-4360:
-

Attachment: MAPREDUCE-4360-22-1.patch

Thanks Konst for your comments.

Updated the patch to fix the formatting issues.

Thanks,
Mayank

 Capacity Scheduler Hierarchical leaf queue does not honur the max capacity of 
 container queue
 -

 Key: MAPREDUCE-4360
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4360
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.22.1
Reporter: Mayank Bansal
Assignee: Mayank Bansal
 Attachments: MAPREDUCE-4360-22-1.patch, MAPREDUCE-4360-22.patch





[jira] [Assigned] (MAPREDUCE-4376) TestClusterMRNotification times out


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee reassigned MAPREDUCE-4376:
-

Assignee: Kihwal Lee

 TestClusterMRNotification times out
 ---

 Key: MAPREDUCE-4376
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4376
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, test
Affects Versions: 2.0.1-alpha
Reporter: Jason Lowe
Assignee: Kihwal Lee

 The TestClusterMRNotification test is often timing out.  git bisect tests 
 narrowed it down to MAPREDUCE-3921, as the test consistently passes before 
 that change and times out most of the time after picking up that change.


[jira] [Commented] (MAPREDUCE-4322) Fix command-line length abort issues on Windows

2012-06-27 Thread Bikas Saha (JIRA)

[
https://issues.apache.org/jira/browse/MAPREDUCE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402489#comment-13402489
]

Bikas Saha commented on MAPREDUCE-4322:
---

That's exactly what I am saying too :) The test is trying to cover both cases,
but the result is kind of implicit right now because we know both paths are
being covered. However, in the test itself by checking for only sb.toString()
we are not making that explicit. There is nothing to hardcode. Unless I am
reading the test code incorrectly, we have already defined List<String> setup
and List<String> cmd. In the exception message, along with checking for
sb.toString(), we could also check for setup[0] and cmd[0]. That way it's
explicit that 2 different paths are being covered.


[jira] [Commented] (MAPREDUCE-4376) TestClusterMRNotification times out


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402499#comment-13402499
 ] 

Kihwal Lee commented on MAPREDUCE-4376:
---

There is a check for null to handle transitions from UNASSIGNED state, but the 
check doesn't work anymore because assignedRequests.get() throws an NPE after 
the following change from MAPREDUCE-3921.

{noformat}
 ContainerId get(TaskAttemptId tId) {
   if (tId.getTaskId().getTaskType().equals(TaskType.MAP)) {
-    return maps.get(tId);
+    return maps.get(tId).getId();
   } else {
-    return reduces.get(tId);
+    return reduces.get(tId).getId();
   }
 }
{noformat}

Jason has also suggested we put a time limit in these jobs so that they don't 
hang even if something goes wrong.
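
One possible way to keep the caller's null check working is to make the lookup itself null-safe. The sketch below uses hypothetical stand-in types (string attempt IDs, a tiny Container class, a boolean instead of TaskType) rather than the real RMContainerAllocator classes, and is not necessarily the fix that gets committed.

```java
import java.util.HashMap;
import java.util.Map;

public class AssignedRequestsSketch {
    /** Hypothetical stand-in for a container; only its ID matters here. */
    static final class Container {
        private final String id;
        Container(String id) { this.id = id; }
        String getId() { return id; }
    }

    private final Map<String, Container> maps = new HashMap<>();
    private final Map<String, Container> reduces = new HashMap<>();

    void add(String attemptId, boolean isMap, Container container) {
        (isMap ? maps : reduces).put(attemptId, container);
    }

    /**
     * Returns the container ID for the attempt, or null if the attempt was
     * never assigned a container (for example, it was killed while still
     * UNASSIGNED). The lookup result is checked before getId() is called.
     */
    String get(String attemptId, boolean isMap) {
        Container container = (isMap ? maps : reduces).get(attemptId);
        return container == null ? null : container.getId();
    }
}
```

With this shape, a caller handling CONTAINER_DEALLOCATE can treat a null return as "nothing to release" instead of hitting the NPE inside get().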


[jira] [Commented] (MAPREDUCE-3837) Hadoop 22 Job tracker is not able to recover job in case of crash and after that no user can submit job.

2012-06-27 Thread Tom White (JIRA)

[
https://issues.apache.org/jira/browse/MAPREDUCE-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402503#comment-13402503
]

Tom White commented on MAPREDUCE-3837:
--

Mayank - thanks for the changes. Here's my feedback:

* If there is no need for restart count anymore - since jobs are re-run from
the beginning each time - then would it be cleaner to remove it entirely?
* In JobTracker you changed shouldRecover = false; to shouldRecover = true;
without updating the comment on the line before. (This might be related to the
previous point about not having restart counts.)
* Remove the @Ignore annotation from TestRecoveryManager and the comment about
MAPREDUCE-873.
* The new test testJobresubmission (should be testJobResubmission) should test
that the job succeeded after the restart. Also, there's no reason to run it as
a high-priority job.
* There's a comment saying it is a faulty job - which it isn't.
* Have setUp and tearDown methods to start and stop the cluster. At the moment
there is code duplication, and clusters won't be shut down cleanly on failure.
* testJobTracker would be better named testJobTrackerRestartsWithMissingJobFile
* testRecoveryManager would be better named testJobTrackerRestartWithBadJobs
* There are multiple typos and formatting errors (including indentation, which
should be 2 spaces) in the new code. See Konstantin's comment above.
* TestJobTrackerRestartWithLostTracker still fails, as does
TestJobTrackerSafeMode. These should be fixed as a part of this work.

Hadoop 22 Job tracker is not able to recover job in case of crash and after
that no user can submit job.

Key: MAPREDUCE-3837
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3837
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Mayank Bansal
Assignee: Mayank Bansal
Fix For: 0.24.0, 0.22.1, 0.23.2

Attachments: PATCH-HADOOP-1-MAPREDUCE-3837-1.patch,
PATCH-HADOOP-1-MAPREDUCE-3837-2.patch, PATCH-HADOOP-1-MAPREDUCE-3837-3.patch,
PATCH-HADOOP-1-MAPREDUCE-3837.patch, PATCH-MAPREDUCE-3837.patch,
PATCH-TRUNK-MAPREDUCE-3837.patch

If the job tracker crashes while jobs are running and the job tracker's
property mapreduce.jobtracker.restart.recover is true, then it should recover
the jobs on restart.
However, the current behavior is as follows:
the jobtracker tries to restore the jobs but cannot. After that, the jobtracker
closes its handle to HDFS and nobody else can submit jobs.
Thanks,
Mayank

[jira] [Updated] (MAPREDUCE-4322) Fix command-line length abort issues on Windows

[
https://issues.apache.org/jira/browse/MAPREDUCE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Ivan Mitic updated MAPREDUCE-4322:
--

Attachment: MAPREDUCE-4322-branch-1-win(5).patch

Attaching updated patch. Adding explicit checks that the correct exception
string is returned. Also removing some of the if (WINDOWS) branches in the test
code.


[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402516#comment-13402516
 ] 

Hudson commented on MAPREDUCE-4346:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #2415 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2415/])
Reverting MAPREDUCE-4346 r1353757 (Revision 1354656)

 Result = FAILURE
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1354656
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestJobClient.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestJobClientGetJob.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobClient.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/Cluster.java



[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

[
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402522#comment-13402522
]

Alejandro Abdelnur commented on MAPREDUCE-4346:
---

@Arun,

I'm working with Ahmed on this one.

The use case we have is large clusters running 1000+ concurrent jobs, where
monitoring agents query the cluster for jobs in different statuses; most of
the time these agents focus on running or just-finished jobs. With the current
API we are forced to query ALL jobs, including retired jobs (which
significantly increases the number of jobs returned), and do the filtering on
the client side. This creates unnecessary load on the JT (serializing all
jobs) and on the client (deserializing all jobs). Adding this new API, which
does not break backwards compatibility, will therefore help reduce this load.

Regarding support in MRv2: we currently have the getAllJobs() method there as
well, and we can certainly address it on the client side (the fallback
implementation Ahmed did in the client for MRv1). We could add a PB call to
support the filtering on the RM side. While looking at the MRv2 code I noticed
that we only query the RM, which means that completed jobs will never be
returned by this call. If I'm correct here, a solution would be for the client
to ask the HS for jobs younger than X; this would be the equivalent of
'retired' jobs, and the filtering would definitely be useful there as well,
for the same reasons explained above.
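
To make the load argument concrete, here is a minimal sketch of the kind of
filtering being discussed, in plain Java. The names used (JobInfo, State,
filterJobs) are hypothetical stand-ins for illustration and are not the Hadoop
API or the contents of the patch. It assumes the semantics described earlier in
the thread: the retired flag only controls whether retired jobs are considered
at all, and the status filter still applies to them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class JobFilterSketch {
    // Hypothetical job states; the real JobStatus constants may differ.
    public enum State { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

    // Hypothetical, minimal stand-in for a job status record.
    public static final class JobInfo {
        public final String id;
        public final State state;
        public final boolean retired;

        public JobInfo(String id, State state, boolean retired) {
            this.id = id;
            this.state = state;
            this.retired = retired;
        }
    }

    // Returns only the jobs whose state is in 'wanted'. Retired jobs are
    // skipped unless includeRetired is true; when included, they are still
    // subject to the status filter.
    public static List<JobInfo> filterJobs(List<JobInfo> all,
                                           Set<State> wanted,
                                           boolean includeRetired) {
        // A HashSet gives a constant-time membership test per job, instead
        // of an inner loop over the wanted states.
        Set<State> lookup = new HashSet<>(wanted);
        List<JobInfo> result = new ArrayList<>();
        for (JobInfo job : all) {
            if (job.retired && !includeRetired) {
                continue;
            }
            if (lookup.contains(job.state)) {
                result.add(job);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<JobInfo> jobs = Arrays.asList(
            new JobInfo("job_1", State.RUNNING, false),
            new JobInfo("job_2", State.SUCCEEDED, true),
            new JobInfo("job_3", State.FAILED, false));
        List<JobInfo> running =
            filterJobs(jobs, EnumSet.of(State.RUNNING), false);
        System.out.println(running.size() + " job(s) match");
    }
}
```

If the filter runs where the jobs are held, only the matching entries need to
be serialized and sent to the client; running the same function on the client
gives the fallback behaviour, but the full list still crosses the wire.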

Adding a refined version of JobTracker.getAllJobs() and exposing through the
JobClient
--

[jira] [Updated] (MAPREDUCE-4355) Add RunningJob.getJobStatus()


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4355:


Description: 
Usecase: Read the start/end-time of a particular job.

Currently, one has to call JobClient.getAllJobStatuses() and iterate through 
the results. JobClient.getJob(JobID) returns RunningJob, which doesn't hold 
the job's start time.

Adding RunningJob.getJobStatus() solves the issue.

  was:
To read the start-time of a particular job, one should not need to getAllJobs() 
and iterate through them.

getJob(JobID) returns RunningJob, which doesn't hold the job's start time.

Hence, we need to either add getJobStatus(JobID) to the API or add startTime to 
RunningJob. Doing the latter.


Summary: Add RunningJob.getJobStatus()  (was: Add startTime to 
RunningJob)

 Add RunningJob.getJobStatus()
 -

 Key: MAPREDUCE-4355
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: mrv1, mrv2
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Fix For: 1.1.0, 2.0.1-alpha

 Attachments: MR-4355_mr1.patch, MR-4355_mr1.patch, MR-4355_mr2.patch


 Usecase: Read the start/end-time of a particular job.
 Currently, one has to call JobClient.getAllJobStatuses() and iterate 
 through the results. JobClient.getJob(JobID) returns RunningJob, which 
 doesn't hold the job's start time.
 Adding RunningJob.getJobStatus() solves the issue.


[jira] [Updated] (MAPREDUCE-4355) Add RunningJob.getJobStatus()


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4355:


Attachment: MR-4355_mr1.patch


[jira] [Commented] (MAPREDUCE-4322) Fix command-line length abort issues on Windows

2012-06-27 Thread Bikas Saha (JIRA)

[
https://issues.apache.org/jira/browse/MAPREDUCE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402576#comment-13402576
]

Bikas Saha commented on MAPREDUCE-4322:
---

Thanks for including all comments! +1. lgtm.

Fix command-line length abort issues on Windows
---

Original Estimate: 12h
Remaining Estimate: 12h

[jira] [Updated] (MAPREDUCE-4355) Add RunningJob.getJobStatus()


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4355:


Attachment: (was: MR-4355_mr1.patch)


[jira] [Updated] (MAPREDUCE-4355) Add RunningJob.getJobStatus()


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4355:


Attachment: MR-4355_mr2.patch


[jira] [Updated] (MAPREDUCE-4355) Add RunningJob.getJobStatus()


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4355:


Attachment: (was: MR-4355_mr2.patch)


[jira] [Updated] (MAPREDUCE-4355) Add RunningJob.getJobStatus()


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4355:


Status: Patch Available  (was: Reopened)

Submitting the MR1 and MR2 patches.

- No tests for MR2 - just added a wrapper call to Job.getStatus()


[jira] [Commented] (MAPREDUCE-4317) Job view ACL checks are too permissive


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402607#comment-13402607
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4317:
---

Karthik,

Why 'job ==null' ?

{code}
+if (!jt.areACLsEnabled() || job == null) {
+  return myJob;
+}
{code}

If job == null, then myJob is also null (or the call may even fail).

Shouldn't we check for job == null before trying to create myJob?



 Job view ACL checks are too permissive
 --

 Key: MAPREDUCE-4317
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4317
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.0.3
Reporter: Harsh J
Assignee: Karthik Kambatla
 Attachments: MR-4317.patch, MR-4317.patch


 The class that does view-based checks, JSPUtil.JobWithViewAccessCheck, has 
 the following internal member:
 {code}private boolean isViewAllowed = true;{code}
 Note that it's true.
 Now, the method that sets proper view-allowed rights has:
 {code}
 if (user != null && job != null && jt.areACLsEnabled()) {
   final UserGroupInformation ugi =
     UserGroupInformation.createRemoteUser(user);
   try {
     ugi.doAs(new PrivilegedExceptionAction<Void>() {
       public Void run() throws IOException, ServletException {
         // checks job view permission
         jt.getACLsManager().checkAccess(job, ugi,
             Operation.VIEW_JOB_DETAILS);
         return null;
       }
     });
   } catch (AccessControlException e) {
     String errMsg = "User " + ugi.getShortUserName() +
         " failed to view " + jobid + "!<br><br>" + e.getMessage() +
         "<hr><a href=\"jobtracker.jsp\">Go back to JobTracker</a><br>";
     JSPUtil.setErrorAndForward(errMsg, request, response);
     myJob.setViewAccess(false);
   } catch (InterruptedException e) {
     String errMsg = " Interrupted while trying to access " + jobid +
         "<hr><a href=\"jobtracker.jsp\">Go back to JobTracker</a><br>";
     JSPUtil.setErrorAndForward(errMsg, request, response);
     myJob.setViewAccess(false);
   }
 }
 return myJob;
 {code}
 In the above snippet, notice that user == null, which can happen if the user 
 is not HTTP-authenticated (as it is obtained via request.getRemoteUser()), 
 can lead to the view being visible, since the default is true and we don't 
 toggle the view to false for the user == null case.
 Ideally the default of the view job ACL must be false, or we need an else 
 clause that sets the view rights to false in case of a failure to find the 
 user ID.
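
 The deny-by-default idea suggested above can be sketched as follows. This is
 an illustration only, not the MR-4317 patch: ViewAccessSketch, AclChecker and
 isViewAllowed are hypothetical names, and the sketch assumes that the view
 stays open when ACLs are disabled.

```java
public class ViewAccessSketch {
    // Hypothetical stand-in for the ACL manager's job-view check.
    public interface AclChecker {
        boolean canView(String user, String jobId);
    }

    // Decides whether the job view may be shown.
    // With ACLs enabled, access is granted only on an explicit positive
    // check; a missing user (not HTTP-authenticated) or a missing job is
    // denied rather than falling through to a permissive default.
    public static boolean isViewAllowed(String user, String jobId,
                                        boolean aclsEnabled,
                                        AclChecker checker) {
        if (!aclsEnabled) {
            return true;   // assumption: no ACLs means no restriction
        }
        if (user == null || jobId == null) {
            return false;  // deny by default
        }
        return checker.canView(user, jobId);
    }

    public static void main(String[] args) {
        AclChecker onlyAlice = (user, jobId) -> "alice".equals(user);
        System.out.println(isViewAllowed(null, "job_1", true, onlyAlice));
        System.out.println(isViewAllowed("alice", "job_1", true, onlyAlice));
    }
}
```

 Returning the decision from one function, rather than starting from true and
 flipping a flag on failure, means every unhandled path ends in a denial.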


[jira] [Commented] (MAPREDUCE-4355) Add RunningJob.getJobStatus()


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402622#comment-13402622
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4355:
---

The mr1 patch has a few false changes in the test class, please revert those.

Please add a simple testcase for the mr2 case.

Also, in the mr1 patch you are using 'updateStatus()' to update the job status 
before returning the object. The method above uses 'ensureFreshStatus()'; why 
the difference?


[jira] [Commented] (MAPREDUCE-4355) Add RunningJob.getJobStatus()

[
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402623#comment-13402623
]

Hadoop QA commented on MAPREDUCE-4355:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12533712/MR-4355_mr2.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse. The patch built with eclipse:eclipse.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed unit tests in
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

+1 contrib tests. The patch passed contrib unit tests.

Test results:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2523//testReport/
Console output:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2523//console

This message is automatically generated.


[jira] [Resolved] (MAPREDUCE-4373) Fix Javadoc warnings in JobClient.


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla resolved MAPREDUCE-4373.
-

  Resolution: Won't Fix
Release Note: The changes from MAPREDUCE-4355 have been reverted, and it 
doesn't suffer from the warnings anymore.

 Fix Javadoc warnings in JobClient.
 --

 Key: MAPREDUCE-4373
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4373
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.0.1-alpha, 3.0.0
Reporter: Robert Joseph Evans
Assignee: Karthik Kambatla

 It looks like MAPREDUCE-4355 added in two new javadoc warnings.
 {code}
 [WARNING] 
 /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobClient.java:651:
  warning - @param argument jobid is not a parameter name.
 [WARNING] 
 /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobClient.java:669:
  warning - @param argument jobid is not a parameter name.
 {code}


[jira] [Updated] (MAPREDUCE-4355) Add RunningJob.getJobStatus()


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4355:


Attachment: MR-4355_mr1.patch

Updated patch for MR1.

ensureFreshStatus() calls updateStatus() only after a particular amount of time 
has passed since previous updateStatus().

For getJobStatus(), to get the latest status, we need to call updateStatus(). 
Do you suggest calling ensureFreshStatus() instead for consistency?
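
The difference between the two calls can be sketched like this. It is a
simplified model under stated assumptions, not the JobClient code: the class
name, the injected fetcher and clock, and the maxAgeMillis threshold are all
hypothetical; only the method names updateStatus(), ensureFreshStatus() and
getJobStatus() come from the discussion.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

public class StatusCacheSketch {
    private final Supplier<String> fetcher;  // hypothetical remote status call
    private final LongSupplier clock;        // injected so it can be tested
    private final long maxAgeMillis;         // hypothetical staleness threshold

    private String status;
    private long lastUpdate;
    private boolean loaded = false;

    public StatusCacheSketch(Supplier<String> fetcher, LongSupplier clock,
                             long maxAgeMillis) {
        this.fetcher = fetcher;
        this.clock = clock;
        this.maxAgeMillis = maxAgeMillis;
    }

    // Always goes to the source and refreshes the cached status.
    public void updateStatus() {
        status = fetcher.get();
        lastUpdate = clock.getAsLong();
        loaded = true;
    }

    // Refreshes only if nothing is cached yet or the cached value is older
    // than maxAgeMillis; otherwise the cached value is kept.
    public void ensureFreshStatus() {
        if (!loaded || clock.getAsLong() - lastUpdate > maxAgeMillis) {
            updateStatus();
        }
    }

    // Cheap accessor: may return a status up to maxAgeMillis old.
    public String getCachedStatus() {
        ensureFreshStatus();
        return status;
    }

    // Accessor that always returns the latest status, at the cost of one
    // remote call per invocation.
    public String getJobStatus() {
        updateStatus();
        return status;
    }
}
```

The trade-off is one remote call per getJobStatus() in exchange for a result
that is never stale, which matters when the caller wants exact start or end
times.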


[jira] [Commented] (MAPREDUCE-4355) Add RunningJob.getJobStatus()


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402640#comment-13402640
 ] 

Hadoop QA commented on MAPREDUCE-4355:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12533723/MR-4355_mr1.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2524//console

This message is automatically generated.


[jira] [Commented] (MAPREDUCE-4317) Job view ACL checks are too permissive


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402641#comment-13402641
 ] 

Karthik Kambatla commented on MAPREDUCE-4317:
-

Alejandro, 

The API (Javadoc below) mentions that the job will be null, if there doesn't 
exist a job with that JobID. The old API also has the same functionality.

{code}
  /**
   * Validates if current user can view the job.
   * If user is not authorized to view the job, this method will modify the
   * response and forwards to an error page and returns Job with
   * viewJobAccess flag set to false.
   * @return JobWithViewAccessCheck object(contains JobInProgress object and
   * viewJobAccess flag). Callers of this method will check the flag
   * and decide if view should be allowed or not. Job will be null if
   * the job with given jobid doesnot exist at the JobTracker.
   */
{code}


[jira] [Commented] (MAPREDUCE-4355) Add RunningJob.getJobStatus()


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402691#comment-13402691
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4355:
---

Regarding changing updateStatus() to ensureFreshStatus(): no, I think 
updateStatus() is more appropriate.



[jira] [Resolved] (MAPREDUCE-4365) Shipping Profiler Libraries by DistributedCache

2012-06-27 Thread Jie Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Li resolved MAPREDUCE-4365.
---

  Resolution: Fixed
Target Version/s:   (was: 1.1.0)

One way is to include the profiler library in the job jar and use a relative 
path such as ../../foo.library to locate it.

Thanks Deveraj, Sid, Vinod and everyone!

 Shipping Profiler Libraries by DistributedCache
 ---

 Key: MAPREDUCE-4365
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4365
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 1.0.3
Reporter: Jie Li

 Hadoop profiling is great for performance tuning and debugging, but currently 
 we can only use Java built-in profilers such as HProf, and for other 
 profilers we need to install them on all slave nodes first, which is 
 inconvenient for large clusters and sometimes impossible for production 
 clusters. 
 Supporting shipping profiler libraries using DistributedCache will solve this 
 problem. For example, in mapred.task.profile.params, we specify a profiler 
 library from the DistributedCache using special place holders such as 
 foo.jar, and Hadoop can look at the DistributedCache to replace foo.jar 
 with the localized path before launching the child jvm.


[jira] [Commented] (MAPREDUCE-4374) Fix child task environment variable config and add support for Windows

[
https://issues.apache.org/jira/browse/MAPREDUCE-4374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402782#comment-13402782
]

Ivan Mitic commented on MAPREDUCE-4374:
---

+1, change looks good to me. Agree on your points for using '%' and ';' on
Windows.

Fix child task environment variable config and add support for Windows
--

Key: MAPREDUCE-4374
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4374
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 1-win
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
Attachments: MAPREDUCE-4374-branch-1-win.patch

In HADOOP-2838, a new feature was introduced to set environment variables via
the Hadoop config 'mapred.child.env' for child tasks. There have been some
further fixes and improvements around this feature, e.g. HADOOP-5981 was a bug
fix; MAPREDUCE-478 broke the config into 'mapred.map.child.env' and
'mapred.reduce.child.env'. However, the current implementation is still not
complete. It does not match its documentation or, I believe, its original
intent. Also, by using ‘:’ (colon) and ‘;’ (semicolon) in the configuration
syntax, we will have problems using them on Windows, because ‘:’ appears very
often in Windows paths, as in “C:\”, and environment variables are very often
used to hold path names. This Jira is created to fix the problem and provide
support on Windows.

[jira] [Commented] (MAPREDUCE-4369) Fix streaming job failures with WindowsResourceCalculatorPlugin

[
https://issues.apache.org/jira/browse/MAPREDUCE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402792#comment-13402792
]

Ivan Mitic commented on MAPREDUCE-4369:
---

Thanks for the change Bikas!

A few questions/suggestions:
1. In {{WindowsResourceCalculatorPlugin#getProcResourceValues()}} you mention
that some tests use JVM_PID. Do you happen to have a list of these tests?
2. Can you please refactor
{{ResourceCalculatorPlugin#getResourceCalculatorPlugin()}} to accept
processPid, and update the call sites to pass the appropriate value (I see only
3 call sites)? The cause of this bug in the first place is that not all call
sites set the processPid accordingly. Then, if the passed-in processPid is
null, you can fall back to {{System.getenv().get(JVM_PID)}}. Make sense? If
I'm seeing things correctly, this way you might be able to clean up some of the
newly introduced code.
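
The fallback suggested in point 2 might look roughly like the sketch below. It
is not the plugin code: PidFallbackSketch and resolvePid are hypothetical
names, the environment is passed in as a map so the logic can be exercised
without touching the real process environment, and the value of the JVM_PID
constant is an assumption.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class PidFallbackSketch {
    // Assumed name of the environment variable behind the JVM_PID constant.
    public static final String JVM_PID = "JVM_PID";

    // Prefers the pid supplied by the call site; falls back to the
    // environment only when the caller passed null. The result may still be
    // null (for example in local mode, where no jvm context is set up), so
    // callers must handle that case.
    public static String resolvePid(String processPid,
                                    Map<String, String> env) {
        if (processPid != null) {
            return processPid;
        }
        return env.get(JVM_PID);
    }

    public static void main(String[] args) {
        Map<String, String> env = new HashMap<>();
        env.put(JVM_PID, "4242");
        System.out.println(resolvePid("1234", env));
        System.out.println(resolvePid(null, env));
        System.out.println(
            resolvePid(null, Collections.<String, String>emptyMap()));
    }
}
```

In real code the map argument would be System.getenv(), as in the expression
quoted above.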

Fix streaming job failures with WindowsResourceCalculatorPlugin
---

Key: MAPREDUCE-4369
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4369
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Bikas Saha
Assignee: Bikas Saha
Attachments: MAPREDUCE-4369.branch-1-win.1.patch

Some streaming jobs use local mode job runs that do not start task trackers.
In these cases, the jvm context is not set up and hence local mode execution
causes the code to crash.
The fix is to either not use ResourceCalculatorPlugin in such cases or make
the local job run create dummy jvm contexts. Choosing the first option because
that's the current implicit behavior on Linux. The ProcfsBasedProcessTree
(used inside the LinuxResourceCalculatorPlugin) does no real work when the
process pid is not set up correctly. This is what happens when local job mode
runs.

[jira] [Created] (MAPREDUCE-4380) Empty Userlogs directory is getting created under logs directory

Devaraj K created MAPREDUCE-4380:


 Summary: Empty Userlogs directory is getting created under logs 
directory
 Key: MAPREDUCE-4380
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4380
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Devaraj K


Empty Userlogs directory is getting created under logs directory.


[jira] [Updated] (MAPREDUCE-4380) Empty Userlogs directory is getting created under logs directory