[jira] [Commented] (MAPREDUCE-4427) Enable the RM to work with AM's that are not managed by it

2012-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414298#comment-13414298
 ] 

Hadoop QA commented on MAPREDUCE-4427:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12536488/MAPREDUCE-4427-3.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 4 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

-1 findbugs.  The patch appears to introduce 4 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient
 hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-api 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.TestClientRMService

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2592//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2592//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2592//console

This message is automatically generated.

> Enable the RM to work with AM's that are not managed by it
> --
>
> Key: MAPREDUCE-4427
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4427
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Bikas Saha
>Assignee: Bikas Saha
>  Labels: mrv2
> Attachments: MAPREDUCE-4427-1.patch, MAPREDUCE-4427-2.patch, 
> MAPREDUCE-4427-3.patch
>
>
> Currently, the RM itself manages the AM: it allocates a container for it, 
> negotiates the launch on the NodeManager, and manages the AM lifecycle. 
> Thereafter, the AM negotiates resources with the RM and launches tasks to do 
> the real work.
> It would be a useful improvement to enhance this model by allowing the AM to 
> be launched independently by the client without requiring the RM. These AMs 
> would be launched on a gateway machine that can talk to the cluster. This 
> would open up new use cases such as the following:
> 1) Easy debugging of the AM, especially during initial development. Having the 
> AM launched on an arbitrary cluster node makes it hard to look at logs or 
> attach a debugger to the AM. If it can be launched locally, these tasks 
> would be easier.
> 2) Running AMs that need special privileges that may not be available on 
> machines managed by the NodeManager.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-4309) Make locatlity in YARN's container assignment and task scheduling pluggable for other deployment topology

2012-07-13 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414294#comment-13414294
 ] 

Junping Du commented on MAPREDUCE-4309:
---

Yes. That's also what I am thinking. Using configuration-based reflection to 
create each related object would make configuration complex and error-prone 
for users (also, I think most of these objects are implementation details).
So, I am thinking of two options based on the factory pattern:
1. Add only one property to the configuration to mark that the topology has 
nodegroups; a factory class then constructs objects based on that 
configuration property.
2. Leverage the "Abstract Factory" pattern: use reflection to create a factory 
class according to a property in the configuration, then use that factory to 
create the different objects accordingly.
I prefer Option 2. What do you think?
@Nicholas, do you have any comments here?
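
To make Option 2 concrete, here is a minimal, self-contained sketch of how it could look. It is not taken from any of the attached patches. The property name, the factory interface, and all class names below are hypothetical, and java.util.Properties stands in for Hadoop's Configuration so the example runs on its own.

```java
import java.util.Properties;

// Hypothetical stand-in for one of the pluggable scheduler objects.
interface SchedulerNodeSketch {
    String localityLevels();
}

class DefaultSchedulerNode implements SchedulerNodeSketch {
    public String localityLevels() { return "node,rack"; }
}

class NodeGroupSchedulerNode implements SchedulerNodeSketch {
    public String localityLevels() { return "node,nodegroup,rack"; }
}

// The abstract factory: one implementation per deployment topology.
interface TopologyFactory {
    SchedulerNodeSketch createSchedulerNode();
}

class DefaultTopologyFactory implements TopologyFactory {
    public SchedulerNodeSketch createSchedulerNode() { return new DefaultSchedulerNode(); }
}

class NodeGroupTopologyFactory implements TopologyFactory {
    public SchedulerNodeSketch createSchedulerNode() { return new NodeGroupSchedulerNode(); }
}

class TopologyFactories {
    // Hypothetical property name, not an actual Hadoop configuration key.
    static final String FACTORY_CLASS_KEY = "example.topology.factory.class";

    // Reflection is used exactly once, to load the factory named in the
    // configuration. All other objects come from ordinary method calls.
    static TopologyFactory fromConfig(Properties conf) {
        String className =
            conf.getProperty(FACTORY_CLASS_KEY, DefaultTopologyFactory.class.getName());
        try {
            Class<? extends TopologyFactory> clazz =
                Class.forName(className).asSubclass(TopologyFactory.class);
            return clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException(
                "Cannot load topology factory: " + className, e);
        }
    }
}
```

With this shape, a user sets a single property to switch topologies, and the individual object types stay an implementation detail of each factory.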

> Make locatlity in YARN's container assignment and task scheduling pluggable 
> for other deployment topology
> -
>
> Key: MAPREDUCE-4309
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4309
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 1.0.0, 2.0.0-alpha
>Reporter: Junping Du
>Assignee: Junping Du
> Attachments: 
> HADOOP-8474-ContainerAssignmentTaskScheduling-pluggable.patch, 
> MAPREDUCE-4309-v2.patch, MAPREDUCE-4309-v3.patch, MAPREDUCE-4309-v4.patch, 
> MAPREDUCE-4309.patch
>
>
> There are several classes in YARN’s container assignment and task scheduling 
> algorithms that relate to data locality which were updated to give preference 
> to running a container on other locality levels besides node-local and 
> rack-local (like nodegroup-local). This proposes to make these data 
> structures/algorithms pluggable, like: SchedulerNode, RMNodeImpl, etc. The 
> inner class ScheduledRequests was made a package-level class so it would be 
> easier to create a subclass, ScheduledRequestsWithNodeGroup.





[jira] [Commented] (MAPREDUCE-4349) Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker

2012-07-13 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414285#comment-13414285
 ] 

Konstantin Shvachko commented on MAPREDUCE-4349:


I would rather integrate verification of archive files into 
{{testCacheConsistency()}} instead of creating a new test case.
You can do
{code}
DistributedCache.addCacheFile(firstCacheFile ...
DistributedCache.addCacheArchive(firstCacheArchive ...
{code}
And then add verification for the archive along with the file.
I think it will be a smaller change, and definitely less code duplication.
Otherwise it will need refactoring to extract the common parts of the code into 
methods.

> Distributed Cache gives inconsistent result if cache Archive files get 
> deleted from task tracker 
> -
>
> Key: MAPREDUCE-4349
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.22.0, 1.0.3, trunk
>Reporter: Mayank Bansal
>Assignee: Mayank Bansal
>Priority: Minor
> Attachments: PATCH-MAPREDUCE-4349-22.patch
>
>






[jira] [Commented] (MAPREDUCE-4445) TestFSSchedulerApp should be in scheduler.fair package

2012-07-13 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414275#comment-13414275
 ] 

Zhihong Ted Yu commented on MAPREDUCE-4445:
---

Welcome.
I should have paid more attention to my Inbox :-)

> TestFSSchedulerApp should be in scheduler.fair package
> --
>
> Key: MAPREDUCE-4445
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4445
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Zhihong Ted Yu
> Attachments: MAPREDUCE-4445.patch
>
>
> MAPREDUCE-3451 added Fair Scheduler to MRv2
> TestFSSchedulerApp was added under 
> src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair 
> but its package was declared to be 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler
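
The mismatch described above can be checked mechanically. The sketch below is only an illustration and is not part of the attached patch: the class and method names are hypothetical, and it assumes the test source file is named TestFSSchedulerApp.java. It derives the package implied by a file's directory and compares it with the declared package.

```java
class PackageLayoutCheck {
    // Derives the package implied by a source file's location under a source root,
    // e.g. root "src/test/java" and file "src/test/java/a/b/Foo.java" give "a.b".
    static String expectedPackage(String sourceRoot, String filePath) {
        String root = sourceRoot.endsWith("/") ? sourceRoot : sourceRoot + "/";
        if (!filePath.startsWith(root)) {
            throw new IllegalArgumentException(filePath + " is not under " + sourceRoot);
        }
        String relative = filePath.substring(root.length());
        int lastSlash = relative.lastIndexOf('/');
        // A file directly under the root belongs to the default (empty) package.
        return lastSlash < 0 ? "" : relative.substring(0, lastSlash).replace('/', '.');
    }

    // True when the declared package matches the directory layout.
    static boolean matches(String sourceRoot, String filePath, String declaredPackage) {
        return expectedPackage(sourceRoot, filePath).equals(declaredPackage);
    }
}
```

For the file in this issue, the directory implies the scheduler.fair package, so a declaration ending in scheduler does not match.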





[jira] [Commented] (MAPREDUCE-4405) Adding test case for HierarchicalQueue in TestJobQueueClient

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414251#comment-13414251
 ] 

Hudson commented on MAPREDUCE-4405:
---

Integrated in Hadoop-Mapreduce-22-branch #110 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-22-branch/110/])
MAPREDUCE-4405. Test case for HierarchicalQueue in TestJobQueueClient. 
Contributed by Mayank Bansal. (Revision 1361335)

 Result = SUCCESS
shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1361335
Files : 
* /hadoop/common/branches/branch-0.22/mapreduce/CHANGES.txt
* 
/hadoop/common/branches/branch-0.22/mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestJobQueueClient.java


> Adding test case for HierarchicalQueue in TestJobQueueClient
> 
>
> Key: MAPREDUCE-4405
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4405
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.22.0
>Reporter: Mayank Bansal
>Assignee: Mayank Bansal
>Priority: Minor
> Fix For: 0.22.1
>
> Attachments: MAPREDUCE-4405-22-v2.patch, MAPREDUCE-4405-22.patch
>
>
> Adding test case for HierarchicalQueue in TestJobQueueClient





[jira] [Commented] (MAPREDUCE-4445) TestFSSchedulerApp should be in scheduler.fair package

2012-07-13 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414245#comment-13414245
 ] 

Karthik Kambatla commented on MAPREDUCE-4445:
-

Hi Zhihong,

Yes. My mistake: I missed adding the link earlier and added it only after 
seeing this JIRA. Sorry for the trouble. Thanks a lot for the patch, though. 

> TestFSSchedulerApp should be in scheduler.fair package
> --
>
> Key: MAPREDUCE-4445
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4445
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Zhihong Ted Yu
> Attachments: MAPREDUCE-4445.patch
>
>




[jira] [Commented] (MAPREDUCE-4445) TestFSSchedulerApp should be in scheduler.fair package

2012-07-13 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414238#comment-13414238
 ] 

Zhihong Ted Yu commented on MAPREDUCE-4445:
---

The link to MAPREDUCE-4441 was added to MAPREDUCE-3451 just now, right?
I didn't see it earlier.

> TestFSSchedulerApp should be in scheduler.fair package
> --
>
> Key: MAPREDUCE-4445
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4445
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Zhihong Ted Yu
> Attachments: MAPREDUCE-4445.patch
>
>




[jira] [Commented] (MAPREDUCE-4309) Make locatlity in YARN's container assignment and task scheduling pluggable for other deployment topology

2012-07-13 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414235#comment-13414235
 ] 

Bikas Saha commented on MAPREDUCE-4309:
---

I am wondering whether reflection is the best way to incorporate plugins. Can 
we replace it with a factory pattern or something similar?

> Make locatlity in YARN's container assignment and task scheduling pluggable 
> for other deployment topology
> -
>
> Key: MAPREDUCE-4309
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4309
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 1.0.0, 2.0.0-alpha
>Reporter: Junping Du
>Assignee: Junping Du
> Attachments: 
> HADOOP-8474-ContainerAssignmentTaskScheduling-pluggable.patch, 
> MAPREDUCE-4309-v2.patch, MAPREDUCE-4309-v3.patch, MAPREDUCE-4309-v4.patch, 
> MAPREDUCE-4309.patch
>
>




[jira] [Updated] (MAPREDUCE-4427) Enable the RM to work with AM's that are not managed by it

2012-07-13 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated MAPREDUCE-4427:
--

Status: Patch Available  (was: Open)

> Enable the RM to work with AM's that are not managed by it
> --
>
> Key: MAPREDUCE-4427
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4427
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Bikas Saha
>Assignee: Bikas Saha
>  Labels: mrv2
> Attachments: MAPREDUCE-4427-1.patch, MAPREDUCE-4427-2.patch, 
> MAPREDUCE-4427-3.patch
>
>




[jira] [Commented] (MAPREDUCE-4427) Enable the RM to work with AM's that are not managed by it

2012-07-13 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414228#comment-13414228
 ] 

Bikas Saha commented on MAPREDUCE-4427:
---

Actually, I will remove the "lfs.mkdir(containerDir, null, true)" change. It is 
unrelated to this JIRA. It was a workaround for an existing bug: task launch 
fails if there are no cache items associated with the task, because in that 
case localization is not triggered.

> Enable the RM to work with AM's that are not managed by it
> --
>
> Key: MAPREDUCE-4427
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4427
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Bikas Saha
>Assignee: Bikas Saha
>  Labels: mrv2
> Attachments: MAPREDUCE-4427-1.patch, MAPREDUCE-4427-2.patch, 
> MAPREDUCE-4427-3.patch
>
>




[jira] [Updated] (MAPREDUCE-4427) Enable the RM to work with AM's that are not managed by it

2012-07-13 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated MAPREDUCE-4427:
--

Attachment: MAPREDUCE-4427-3.patch

Attaching a new patch based on the comments.

> Enable the RM to work with AM's that are not managed by it
> --
>
> Key: MAPREDUCE-4427
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4427
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Bikas Saha
>Assignee: Bikas Saha
>  Labels: mrv2
> Attachments: MAPREDUCE-4427-1.patch, MAPREDUCE-4427-2.patch, 
> MAPREDUCE-4427-3.patch
>
>




[jira] [Updated] (MAPREDUCE-4427) Enable the RM to work with AM's that are not managed by it

2012-07-13 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated MAPREDUCE-4427:
--

Status: Open  (was: Patch Available)

> Enable the RM to work with AM's that are not managed by it
> --
>
> Key: MAPREDUCE-4427
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4427
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Bikas Saha
>Assignee: Bikas Saha
>  Labels: mrv2
> Attachments: MAPREDUCE-4427-1.patch, MAPREDUCE-4427-2.patch
>
>




[jira] [Commented] (MAPREDUCE-4427) Enable the RM to work with AM's that are not managed by it

2012-07-13 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414226#comment-13414226
 ] 

Bikas Saha commented on MAPREDUCE-4427:
---

The client-side patch should exemplify the use case clearly.
Yes, we can implement a check mechanism if needed.
I am guessing that queue checks depend on the actual container resource 
allocations made to jobs, and so would not count resources for these AMs 
because they are not allocated from cluster resources.
The RM will do everything it does for a normal AM except clean up the AM 
container. So task containers would be killed and the AM unregistered from 
AppMasterService. If the AM actually continues running, it will get a REBOOT 
response from the AppMasterService on the next allocate() heartbeat and will 
not get any more container assignments. So it will be practically useless.
Sorry, I forgot to remove the comment. Your understanding is correct. I will 
fix the comment.
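
The heartbeat behaviour described above can be pictured with a toy model. This is only a sketch under assumptions: the class below is hypothetical and is not the real AppMasterService API. It shows only that an AM which keeps running after being unregistered is told to reboot on its next allocate() call.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model, not the real AppMasterService: it only tracks which
// application attempts are currently registered.
class AppMasterServiceSketch {
    enum Response { OK, REBOOT }

    private final Set<String> registered = new HashSet<>();

    // Called when an AM (managed or unmanaged) registers with the RM.
    void register(String attemptId) {
        registered.add(attemptId);
    }

    // Called by the RM when the application finishes or is killed.
    // For an unmanaged AM the RM has no AM container to clean up,
    // so the AM process itself may still be running afterwards.
    void unregister(String attemptId) {
        registered.remove(attemptId);
    }

    // The AM's periodic heartbeat. An unknown attempt is told to reboot
    // and receives no further container assignments.
    Response allocate(String attemptId) {
        return registered.contains(attemptId) ? Response.OK : Response.REBOOT;
    }
}
```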

> Enable the RM to work with AM's that are not managed by it
> --
>
> Key: MAPREDUCE-4427
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4427
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Bikas Saha
>Assignee: Bikas Saha
>  Labels: mrv2
> Attachments: MAPREDUCE-4427-1.patch, MAPREDUCE-4427-2.patch
>
>




[jira] [Commented] (MAPREDUCE-4422) YARN_APPLICATION_CLASSPATH needs a documented default value in YarnConfiguration

2012-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414225#comment-13414225
 ] 

Hadoop QA commented on MAPREDUCE-4422:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12536485/MAPREDUCE-4422_rev4.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell
 hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2591//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2591//console

This message is automatically generated.

> YARN_APPLICATION_CLASSPATH needs a documented default value in 
> YarnConfiguration
> 
>
> Key: MAPREDUCE-4422
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4422
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.0.0-alpha
>Reporter: Arun C Murthy
>Assignee: Ahmed Radwan
> Fix For: 2.0.1-alpha
>
> Attachments: MAPREDUCE-4422.patch, MAPREDUCE-4422_rev2.patch, 
> MAPREDUCE-4422_rev3.patch, MAPREDUCE-4422_rev3.patch, 
> MAPREDUCE-4422_rev4.patch
>
>
> MAPREDUCE-3505 allowed YARN_APPLICATION_CLASSPATH to be configurable.
> However, we didn't add a default value to YarnConfiguration, as is the norm.
> Ran into it while investigating MAPREDUCE-4421.
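
The "documented default" convention mentioned above can be sketched as a key constant paired with a default constant that is used whenever the key is unset. Everything below is illustrative and is not copied from the attached patches: the key string and the default entries are assumptions, and java.util.Properties stands in for Hadoop's Configuration.

```java
import java.util.Properties;

class ClasspathConfigSketch {
    // Assumed key string, shown for illustration only.
    static final String YARN_APPLICATION_CLASSPATH = "yarn.application.classpath";

    // Placeholder default entries (an assumption, not the values from the patch).
    static final String[] DEFAULT_YARN_APPLICATION_CLASSPATH = {
        "$HADOOP_CONF_DIR",
        "$HADOOP_COMMON_HOME/share/hadoop/common/*"
    };

    // Returns the configured classpath entries, or the documented default
    // when the property is missing or blank.
    static String[] applicationClasspath(Properties conf) {
        String value = conf.getProperty(YARN_APPLICATION_CLASSPATH);
        if (value == null || value.trim().isEmpty()) {
            return DEFAULT_YARN_APPLICATION_CLASSPATH.clone();
        }
        // Entries are comma-separated; surrounding whitespace is ignored.
        return value.trim().split("\\s*,\\s*");
    }
}
```

Keeping the default next to the key means every caller falls back to the same value instead of each one hard-coding its own.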





[jira] [Updated] (MAPREDUCE-4445) TestFSSchedulerApp should be in scheduler.fair package

2012-07-13 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4445:


Resolution: Duplicate
Status: Resolved  (was: Patch Available)

Duplicate of MR-4441

> TestFSSchedulerApp should be in scheduler.fair package
> --
>
> Key: MAPREDUCE-4445
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4445
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Zhihong Ted Yu
> Attachments: MAPREDUCE-4445.patch
>
>




[jira] [Commented] (MAPREDUCE-4445) TestFSSchedulerApp should be in scheduler.fair package

2012-07-13 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414205#comment-13414205
 ] 

Karthik Kambatla commented on MAPREDUCE-4445:
-

Zhihong, MR-4441 already addressed this, and has been committed to trunk and 
branch-2.

> TestFSSchedulerApp should be in scheduler.fair package
> --
>
> Key: MAPREDUCE-4445
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4445
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Zhihong Ted Yu
> Attachments: MAPREDUCE-4445.patch
>
>




[jira] [Updated] (MAPREDUCE-4422) YARN_APPLICATION_CLASSPATH needs a documented default value in YarnConfiguration

2012-07-13 Thread Ahmed Radwan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan updated MAPREDUCE-4422:


Attachment: MAPREDUCE-4422_rev4.patch

Nice NIT! Thanks Tucu. Here is the updated patch.

> YARN_APPLICATION_CLASSPATH needs a documented default value in 
> YarnConfiguration
> 
>
> Key: MAPREDUCE-4422
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4422
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.0.0-alpha
>Reporter: Arun C Murthy
>Assignee: Ahmed Radwan
> Fix For: 2.0.1-alpha
>
> Attachments: MAPREDUCE-4422.patch, MAPREDUCE-4422_rev2.patch, 
> MAPREDUCE-4422_rev3.patch, MAPREDUCE-4422_rev3.patch, 
> MAPREDUCE-4422_rev4.patch
>
>




[jira] [Updated] (MAPREDUCE-4422) YARN_APPLICATION_CLASSPATH needs a documented default value in YarnConfiguration

2012-07-13 Thread Ahmed Radwan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan updated MAPREDUCE-4422:


Status: Patch Available  (was: Reopened)

> YARN_APPLICATION_CLASSPATH needs a documented default value in 
> YarnConfiguration
> 
>
> Key: MAPREDUCE-4422
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4422
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.0.0-alpha
>Reporter: Arun C Murthy
>Assignee: Ahmed Radwan
> Fix For: 2.0.1-alpha
>
> Attachments: MAPREDUCE-4422.patch, MAPREDUCE-4422_rev2.patch, 
> MAPREDUCE-4422_rev3.patch, MAPREDUCE-4422_rev3.patch, 
> MAPREDUCE-4422_rev4.patch
>
>
> MAPREDUCE-3505 allowed YARN_APPLICATION_CLASSPATH to be configurable.
> However, we didn't add a default value to YarnConfiguration, as is the norm.
> Ran into it while investigating MAPREDUCE-4421.





[jira] [Updated] (MAPREDUCE-4445) TestFSSchedulerApp should be in scheduler.fair package

2012-07-13 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated MAPREDUCE-4445:
--

Attachment: MAPREDUCE-4445.patch

Simple patch.

> TestFSSchedulerApp should be in scheduler.fair package
> --
>
> Key: MAPREDUCE-4445
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4445
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Zhihong Ted Yu
> Attachments: MAPREDUCE-4445.patch
>
>
> MAPREDUCE-3451 added Fair Scheduler to MRv2
> TestFSSchedulerApp was added under 
> src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair 
> but its package was declared to be 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler





[jira] [Updated] (MAPREDUCE-4445) TestFSSchedulerApp should be in scheduler.fair package

2012-07-13 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated MAPREDUCE-4445:
--

Status: Patch Available  (was: Open)

> TestFSSchedulerApp should be in scheduler.fair package
> --
>
> Key: MAPREDUCE-4445
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4445
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Zhihong Ted Yu
> Attachments: MAPREDUCE-4445.patch
>
>
> MAPREDUCE-3451 added Fair Scheduler to MRv2
> TestFSSchedulerApp was added under 
> src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair 
> but its package was declared to be 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler





[jira] [Commented] (MAPREDUCE-3451) Port Fair Scheduler to MR2

2012-07-13 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414176#comment-13414176
 ] 

Zhihong Ted Yu commented on MAPREDUCE-3451:
---

MAPREDUCE-4445 is the correct JIRA, sorry about this.

> Port Fair Scheduler to MR2
> --
>
> Key: MAPREDUCE-3451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3451
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv2, scheduler
>Reporter: Patrick Wendell
>Assignee: Patrick Wendell
> Fix For: 2.0.1-alpha
>
> Attachments: MAPREDUCE-3451.v1.patch.txt, 
> MAPREDUCE-3451.v2.patch.txt, MAPREDUCE-3451.v3.patch.txt, 
> MAPREDUCE-3451.v4.patch.txt, MAPREDUCE-3451.v5.patch, 
> MAPREDUCE-3451.v6.patch, MAPREDUCE-3451.v7.patch, MAPREDUCE-3451.v8.patch, 
> MAPREDUCE-3451.v9.patch
>
>
> The Fair Scheduler is in widespread use today in MR1 clusters, but not yet 
> ported to MR2. This is to track the porting of the Fair Scheduler to MR2 and 
> will be updated to include design considerations and progress.





[jira] [Created] (MAPREDUCE-4445) TestFSSchedulerApp should be in scheduler.fair package

2012-07-13 Thread Zhihong Ted Yu (JIRA)
Zhihong Ted Yu created MAPREDUCE-4445:
-

 Summary: TestFSSchedulerApp should be in scheduler.fair package
 Key: MAPREDUCE-4445
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4445
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Zhihong Ted Yu


MAPREDUCE-3451 added Fair Scheduler to MRv2

TestFSSchedulerApp was added under 
src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair but 
its package was declared to be 
org.apache.hadoop.yarn.server.resourcemanager.scheduler
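
The mismatch described above comes from the usual Java convention that a source file's package mirrors its directory under the source root. The following is a minimal sketch of that mapping; the class and method names are hypothetical and are not part of Hadoop or of the attached patch.

```java
// Hypothetical helper: derives the package a source file is conventionally
// expected to declare from its directory below a source root.
final class PackageFromPath {
    private PackageFromPath() {}

    static String expectedPackage(String sourceRoot, String directory) {
        String root = sourceRoot.endsWith("/") ? sourceRoot : sourceRoot + "/";
        if (!directory.startsWith(root)) {
            throw new IllegalArgumentException(directory + " is not under " + sourceRoot);
        }
        String relative = directory.substring(root.length());
        if (relative.endsWith("/")) {
            relative = relative.substring(0, relative.length() - 1);
        }
        // Each directory level becomes one package segment.
        return relative.replace('/', '.');
    }
}
```

For the directory named in this issue, the helper yields a package ending in scheduler.fair, which is the declaration the issue title asks for.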





[jira] [Commented] (MAPREDUCE-3451) Port Fair Scheduler to MR2

2012-07-13 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414173#comment-13414173
 ] 

Zhihong Ted Yu commented on MAPREDUCE-3451:
---

HBASE-6395 has been created for putting TestFSSchedulerApp in the right package.

> Port Fair Scheduler to MR2
> --
>
> Key: MAPREDUCE-3451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3451
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv2, scheduler
>Reporter: Patrick Wendell
>Assignee: Patrick Wendell
> Fix For: 2.0.1-alpha
>
> Attachments: MAPREDUCE-3451.v1.patch.txt, 
> MAPREDUCE-3451.v2.patch.txt, MAPREDUCE-3451.v3.patch.txt, 
> MAPREDUCE-3451.v4.patch.txt, MAPREDUCE-3451.v5.patch, 
> MAPREDUCE-3451.v6.patch, MAPREDUCE-3451.v7.patch, MAPREDUCE-3451.v8.patch, 
> MAPREDUCE-3451.v9.patch
>
>
> The Fair Scheduler is in widespread use today in MR1 clusters, but not yet 
> ported to MR2. This is to track the porting of the Fair Scheduler to MR2 and 
> will be updated to include design considerations and progress.





[jira] [Commented] (MAPREDUCE-3451) Port Fair Scheduler to MR2

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414163#comment-13414163
 ] 

Hudson commented on MAPREDUCE-3451:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #2486 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2486/])
MAPREDUCE-3451. Amendment, excluding findbugs warnings (tucu) (Revision 
1361436)

 Result = FAILURE
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1361436
Files : 
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/dev-support/findbugs-exclude.xml


> Port Fair Scheduler to MR2
> --
>
> Key: MAPREDUCE-3451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3451
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv2, scheduler
>Reporter: Patrick Wendell
>Assignee: Patrick Wendell
> Fix For: 2.0.1-alpha
>
> Attachments: MAPREDUCE-3451.v1.patch.txt, 
> MAPREDUCE-3451.v2.patch.txt, MAPREDUCE-3451.v3.patch.txt, 
> MAPREDUCE-3451.v4.patch.txt, MAPREDUCE-3451.v5.patch, 
> MAPREDUCE-3451.v6.patch, MAPREDUCE-3451.v7.patch, MAPREDUCE-3451.v8.patch, 
> MAPREDUCE-3451.v9.patch
>
>
> The Fair Scheduler is in widespread use today in MR1 clusters, but not yet 
> ported to MR2. This is to track the porting of the Fair Scheduler to MR2 and 
> will be updated to include design considerations and progress.





[jira] [Commented] (MAPREDUCE-4422) YARN_APPLICATION_CLASSPATH needs a documented default value in YarnConfiguration

2012-07-13 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414145#comment-13414145
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4422:
---

Ahmed, one NIT I see: The default value should be a String[], doing the 
"".split() in the constant definition. Then, when it is used as the default 
value, split() is not called over and over.
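
A minimal sketch of this suggestion, assuming a hypothetical constants class. The class name, method name, and classpath entries below are illustrative only and are not the actual YarnConfiguration values. The default string is split once, when the constant is defined, so callers get a String[] without re-splitting.

```java
// Hypothetical stand-in for a configuration constants class; names and
// classpath entries are illustrative assumptions, not Hadoop's real defaults.
final class ClasspathDefaults {
    // Split once at class-initialization time rather than on every lookup.
    static final String[] DEFAULT_APPLICATION_CLASSPATH =
        "$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/share/hadoop/common/*".split(",");

    private ClasspathDefaults() {}

    // Returns the configured entries, or a copy of the pre-split default
    // when nothing is configured.
    static String[] classpathOrDefault(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            return DEFAULT_APPLICATION_CLASSPATH.clone();
        }
        return configured.trim().split("\\s*,\\s*");
    }
}
```

Returning a clone keeps callers from mutating the shared constant array. Whether that copy is worth its cost is a design choice this sketch does not settle.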

> YARN_APPLICATION_CLASSPATH needs a documented default value in 
> YarnConfiguration
> 
>
> Key: MAPREDUCE-4422
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4422
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.0.0-alpha
>Reporter: Arun C Murthy
>Assignee: Ahmed Radwan
> Fix For: 2.0.1-alpha
>
> Attachments: MAPREDUCE-4422.patch, MAPREDUCE-4422_rev2.patch, 
> MAPREDUCE-4422_rev3.patch, MAPREDUCE-4422_rev3.patch
>
>
> MAPREDUCE-3505 allowed YARN_APPLICATION_CLASSPATH to be configurable.
> However, we didn't add a default value to YarnConfiguration, as is the norm.
> Ran into it while investigating MAPREDUCE-4421.





[jira] [Commented] (MAPREDUCE-4283) Display tail of aggregated logs by default

2012-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414143#comment-13414143
 ] 

Hadoop QA commented on MAPREDUCE-4283:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12536472/MAPREDUCE-4283.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2590//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2590//console

This message is automatically generated.

> Display tail of aggregated logs by default
> --
>
> Key: MAPREDUCE-4283
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4283
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver, mrv2
>Affects Versions: 0.23.3, 2.0.1-alpha
>Reporter: Jason Lowe
>Assignee: Jason Lowe
> Attachments: MAPREDUCE-4283.patch, MAPREDUCE-4283.patch, 
> MAPREDUCE-4283.patch
>
>
> Similar to the manner in which the nodemanager webUI displays container logs, 
> it would be very useful if the historyserver showed the trailing 4K or so of 
> the aggregated logs with a link to see the full log.
> When debugging issues the relevant errors are usually at the end of the log, 
> so showing just the last few K can enable quick diagnosis without waiting for 
> what can be many megabytes of log data to download. 
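
The "trailing 4K" idea in the description can be sketched as follows. This is an illustration using plain java.io on a local file, not the historyserver's actual implementation; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that returns only the end of a log file.
final class LogTail {
    private LogTail() {}

    // Reads at most maxBytes from the end of the file at the given path.
    static String tail(String path, int maxBytes) throws IOException {
        if (maxBytes < 0) {
            throw new IllegalArgumentException("maxBytes must be non-negative");
        }
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            long length = file.length();
            long start = Math.max(0L, length - maxBytes);
            byte[] buffer = new byte[(int) (length - start)];
            file.seek(start);          // skip everything before the tail
            file.readFully(buffer);
            return new String(buffer, StandardCharsets.UTF_8);
        }
    }
}
```

A call such as LogTail.tail(path, 4096) would return roughly the last 4K. Note that seeking to a byte offset can land in the middle of a multi-byte character, so the first character of the result may be garbled.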





[jira] [Commented] (MAPREDUCE-4406) Users should be able to specify the MiniCluster ResourceManager and JobHistoryServer ports

2012-07-13 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414139#comment-13414139
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4406:
---

The first patch would reintroduce fixed ports in the minicluster, which we 
don't want when running test cases.

We could achieve this with a new config property, i.e. 
'hadoop.minicluster.fixed.ports' = FALSE (default).

If this property is FALSE, we use the current random port configuration.

If this property is TRUE, we use the fixed port behavior from configuration.

Then MAPREDUCE-987 would set this property to TRUE before starting the 
minicluster.

This wouldn't change the current random behavior which is desirable for 
dev/jenkins.
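
The switch proposed above could be sketched like this. The property name is the one suggested in this comment; the class, the method, and the use of java.util.Properties in place of a Hadoop Configuration are assumptions made for illustration.

```java
import java.util.Properties;

// Hypothetical sketch of the proposed fixed-vs-random port switch.
final class MiniClusterPortPolicy {
    static final String FIXED_PORTS_KEY = "hadoop.minicluster.fixed.ports";

    private MiniClusterPortPolicy() {}

    // Returns the configured port when fixed ports are enabled; otherwise 0,
    // which here stands for "pick any free port" (as with java.net.ServerSocket).
    static int choosePort(Properties conf, String portKey) {
        boolean fixed = Boolean.parseBoolean(conf.getProperty(FIXED_PORTS_KEY, "false"));
        if (!fixed) {
            return 0;
        }
        String value = conf.getProperty(portKey);
        if (value == null) {
            throw new IllegalArgumentException("fixed ports requested but " + portKey + " is not set");
        }
        return Integer.parseInt(value.trim());
    }
}
```

With the property left unset, the sketch keeps the random-port behavior, which matches the default of FALSE described above.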



> Users should be able to specify the MiniCluster ResourceManager and 
> JobHistoryServer ports
> --
>
> Key: MAPREDUCE-4406
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4406
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, test
>Reporter: Ahmed Radwan
>Assignee: Ahmed Radwan
> Attachments: MAPREDUCE-4406.patch
>
>
> There are use cases where users may need to specify the ports used for the 
> resource manager and history server for the minicluster.
> In the current implementation, the MiniCluster sets these addresses 
> regardless of them being already set by the user in the conf.
> Users should be able to add these properties to the conf and in such case the 
> MiniCluster will use the specified addresses. If not specified then the 
> current behavior of the MiniCluster for explicitly setting the addresses will 
> be used.
> I'll be uploading a patch momentarily.





[jira] [Commented] (MAPREDUCE-3451) Port Fair Scheduler to MR2

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414133#comment-13414133
 ] 

Hudson commented on MAPREDUCE-3451:
---

Integrated in Hadoop-Hdfs-trunk-Commit #2532 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2532/])
MAPREDUCE-3451. Amendment, excluding findbugs warnings (tucu) (Revision 
1361436)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1361436
Files : 
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/dev-support/findbugs-exclude.xml


> Port Fair Scheduler to MR2
> --
>
> Key: MAPREDUCE-3451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3451
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv2, scheduler
>Reporter: Patrick Wendell
>Assignee: Patrick Wendell
> Fix For: 2.0.1-alpha
>
> Attachments: MAPREDUCE-3451.v1.patch.txt, 
> MAPREDUCE-3451.v2.patch.txt, MAPREDUCE-3451.v3.patch.txt, 
> MAPREDUCE-3451.v4.patch.txt, MAPREDUCE-3451.v5.patch, 
> MAPREDUCE-3451.v6.patch, MAPREDUCE-3451.v7.patch, MAPREDUCE-3451.v8.patch, 
> MAPREDUCE-3451.v9.patch
>
>
> The Fair Scheduler is in widespread use today in MR1 clusters, but not yet 
> ported to MR2. This is to track the porting of the Fair Scheduler to MR2 and 
> will be updated to include design considerations and progress.





[jira] [Commented] (MAPREDUCE-3451) Port Fair Scheduler to MR2

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414132#comment-13414132
 ] 

Hudson commented on MAPREDUCE-3451:
---

Integrated in Hadoop-Common-trunk-Commit #2466 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2466/])
MAPREDUCE-3451. Amendment, excluding findbugs warnings (tucu) (Revision 
1361436)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1361436
Files : 
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/dev-support/findbugs-exclude.xml


> Port Fair Scheduler to MR2
> --
>
> Key: MAPREDUCE-3451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3451
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv2, scheduler
>Reporter: Patrick Wendell
>Assignee: Patrick Wendell
> Fix For: 2.0.1-alpha
>
> Attachments: MAPREDUCE-3451.v1.patch.txt, 
> MAPREDUCE-3451.v2.patch.txt, MAPREDUCE-3451.v3.patch.txt, 
> MAPREDUCE-3451.v4.patch.txt, MAPREDUCE-3451.v5.patch, 
> MAPREDUCE-3451.v6.patch, MAPREDUCE-3451.v7.patch, MAPREDUCE-3451.v8.patch, 
> MAPREDUCE-3451.v9.patch
>
>
> The Fair Scheduler is in widespread use today in MR1 clusters, but not yet 
> ported to MR2. This is to track the porting of the Fair Scheduler to MR2 and 
> will be updated to include design considerations and progress.





[jira] [Updated] (MAPREDUCE-4283) Display tail of aggregated logs by default

2012-07-13 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-4283:
--

Attachment: MAPREDUCE-4283.patch

> Display tail of aggregated logs by default
> --
>
> Key: MAPREDUCE-4283
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4283
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver, mrv2
>Affects Versions: 0.23.3, 2.0.1-alpha
>Reporter: Jason Lowe
>Assignee: Jason Lowe
> Attachments: MAPREDUCE-4283.patch, MAPREDUCE-4283.patch, 
> MAPREDUCE-4283.patch
>
>
> Similar to the manner in which the nodemanager webUI displays container logs, 
> it would be very useful if the historyserver showed the trailing 4K or so of 
> the aggregated logs with a link to see the full log.
> When debugging issues the relevant errors are usually at the end of the log, 
> so showing just the last few K can enable quick diagnosis without waiting for 
> what can be many megabytes of log data to download. 





[jira] [Updated] (MAPREDUCE-4283) Display tail of aggregated logs by default

2012-07-13 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-4283:
--

Status: Patch Available  (was: Open)

> Display tail of aggregated logs by default
> --
>
> Key: MAPREDUCE-4283
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4283
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver, mrv2
>Affects Versions: 0.23.3, 2.0.1-alpha
>Reporter: Jason Lowe
>Assignee: Jason Lowe
> Attachments: MAPREDUCE-4283.patch, MAPREDUCE-4283.patch, 
> MAPREDUCE-4283.patch
>
>
> Similar to the manner in which the nodemanager webUI displays container logs, 
> it would be very useful if the historyserver showed the trailing 4K or so of 
> the aggregated logs with a link to see the full log.
> When debugging issues the relevant errors are usually at the end of the log, 
> so showing just the last few K can enable quick diagnosis without waiting for 
> what can be many megabytes of log data to download. 





[jira] [Updated] (MAPREDUCE-4439) MAPREDUCE-3451 introduced a bunch of findbugs warnings

2012-07-13 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated MAPREDUCE-4439:
--

Description: 
Committed findbugs exclusions 
(hadoop-mapreduce-project/hadoop-yarn/dev-support/findbugs-exclude.xml) as an 
amendment of MAPREDUCE-3451. 

Lowering priority to Major as the warnings are excluded.

Reassigning to Patrick to verify and disregard, or fix, the warning issues. If 
the warnings are invalid, please close this JIRA as won't fix.






   Priority: Major  (was: Blocker)
   Assignee: Patrick Wendell  (was: Alejandro Abdelnur)

> MAPREDUCE-3451 introduced a bunch of findbugs warnings
> --
>
> Key: MAPREDUCE-4439
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4439
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Arun C Murthy
>Assignee: Patrick Wendell
>
> Committed findbugs exclusions 
> (hadoop-mapreduce-project/hadoop-yarn/dev-support/findbugs-exclude.xml) as an 
> amendment of MAPREDUCE-3451. 
> Lowering priority to Major as the warnings are excluded.
> Reassigning to Patrick to verify and disregard, or fix, the warning issues. If 
> the warnings are invalid, please close this JIRA as won't fix.





[jira] [Assigned] (MAPREDUCE-4414) Add main methods to JobConf and YarnConfiguration, for debug purposes

2012-07-13 Thread Linden Hillenbrand (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Linden Hillenbrand reassigned MAPREDUCE-4414:
-

Assignee: Linden Hillenbrand

> Add main methods to JobConf and YarnConfiguration, for debug purposes
> -
>
> Key: MAPREDUCE-4414
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4414
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Linden Hillenbrand
>  Labels: newbie
>
> Just like Configuration has a main() func that dumps XML out for debug 
> purposes, we should have a similar function under the JobConf and 
> YarnConfiguration classes that does the same. This is useful in testing out app 
> classpath setups at times.
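
As a rough illustration of what such a debug main() might print, the sketch below renders key/value pairs in the familiar configuration/property/name/value layout. It is a stand-alone stand-in that uses a plain Map rather than the real Configuration, JobConf, or YarnConfiguration classes. The class name and the sample key and value are assumptions.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone dumper; not the Hadoop Configuration API.
final class ConfDump {
    private ConfDump() {}

    // Renders the entries as XML, sorted by key for stable output.
    static String toXml(Map<String, String> conf) {
        StringBuilder out = new StringBuilder("<configuration>\n");
        for (Map.Entry<String, String> e : new TreeMap<String, String>(conf).entrySet()) {
            out.append("  <property><name>").append(escape(e.getKey()))
               .append("</name><value>").append(escape(e.getValue()))
               .append("</value></property>\n");
        }
        return out.append("</configuration>\n").toString();
    }

    // Escapes the characters that would otherwise break element content.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        Map<String, String> conf = new TreeMap<String, String>();
        conf.put("yarn.application.classpath", "$HADOOP_CONF_DIR"); // sample value only
        System.out.print(toXml(conf));
    }
}
```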





[jira] [Commented] (MAPREDUCE-4299) Terasort hangs with MR2 FifoScheduler

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414117#comment-13414117
 ] 

Hudson commented on MAPREDUCE-4299:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #2485 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2485/])
MAPREDUCE-4299. Terasort hangs with MR2 FifoScheduler (Tom White via bobby) 
(Revision 1361397)

 Result = FAILURE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1361397
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestFifoScheduler.java


> Terasort hangs with MR2 FifoScheduler
> -
>
> Key: MAPREDUCE-4299
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4299
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.0-alpha
>Reporter: Tom White
>Assignee: Tom White
> Fix For: 0.23.3, 2.0.1-alpha, 3.0.0
>
> Attachments: MAPREDUCE-4299.patch, MAPREDUCE-4299.patch, 
> MAPREDUCE-4299.patch
>
>
> What happens is that the number of reducers ramps up until they occupy all of 
> the job's containers, at which point the maps no longer make any progress and 
> the job hangs.
> When the same job is run with the CapacityScheduler it succeeds, so this 
> looks like a FifoScheduler bug.





[jira] [Updated] (MAPREDUCE-4283) Display tail of aggregated logs by default

2012-07-13 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-4283:
--

Status: Open  (was: Patch Available)

Canceling patch to address the Findbugs warning.

> Display tail of aggregated logs by default
> --
>
> Key: MAPREDUCE-4283
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4283
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver, mrv2
>Affects Versions: 0.23.3, 2.0.1-alpha
>Reporter: Jason Lowe
>Assignee: Jason Lowe
> Attachments: MAPREDUCE-4283.patch, MAPREDUCE-4283.patch
>
>
> Similar to the manner in which the nodemanager webUI displays container logs, 
> it would be very useful if the historyserver showed the trailing 4K or so of 
> the aggregated logs with a link to see the full log.
> When debugging issues the relevant errors are usually at the end of the log, 
> so showing just the last few K can enable quick diagnosis without waiting for 
> what can be many megabytes of log data to download. 





[jira] [Commented] (MAPREDUCE-4283) Display tail of aggregated logs by default

2012-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414112#comment-13414112
 ] 

Hadoop QA commented on MAPREDUCE-4283:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12536467/MAPREDUCE-4283.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

-1 findbugs.  The patch appears to introduce 1 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2589//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2589//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2589//console

This message is automatically generated.

> Display tail of aggregated logs by default
> --
>
> Key: MAPREDUCE-4283
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4283
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver, mrv2
>Affects Versions: 0.23.3, 2.0.1-alpha
>Reporter: Jason Lowe
>Assignee: Jason Lowe
> Attachments: MAPREDUCE-4283.patch, MAPREDUCE-4283.patch
>
>
> Similar to the manner in which the nodemanager webUI displays container logs, 
> it would be very useful if the historyserver showed the trailing 4K or so of 
> the aggregated logs with a link to see the full log.
> When debugging issues the relevant errors are usually at the end of the log, 
> so showing just the last few K can enable quick diagnosis without waiting for 
> what can be many megabytes of log data to download. 





[jira] [Commented] (MAPREDUCE-4334) Add support for CPU isolation/monitoring of containers

2012-07-13 Thread Andrew Ferguson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414108#comment-13414108
 ] 

Andrew Ferguson commented on MAPREDUCE-4334:


Arun -- I think we might be talking past each other, as we agree that both 
cgroups and taskset should be available.

BTW, it turns out the sched_setaffinity() syscall does not require root if it 
is applied to a process you own. Therefore, if you are running with the 
DefaultContainerExecutor, you can still use sched_setaffinity, which is 
excellent.
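
As a rough illustration of the no-root path, here is a minimal sketch. It assumes the NodeManager shells out to the taskset utility instead of calling sched_setaffinity() through native code. The class and method names (TasksetCommand, build) are hypothetical and are not part of any patch on this issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper, not part of Hadoop: builds the argv for
//   taskset -cp <cpu-list> <pid>
// which changes the CPU affinity of an already running process.
final class TasksetCommand {
  static List<String> build(long pid, int... cpus) {
    if (cpus.length == 0) {
      throw new IllegalArgumentException("at least one CPU is required");
    }
    StringJoiner cpuList = new StringJoiner(",");
    for (int cpu : cpus) {
      if (cpu < 0) {
        throw new IllegalArgumentException("negative CPU index: " + cpu);
      }
      cpuList.add(Integer.toString(cpu));
    }
    List<String> argv = new ArrayList<>();
    argv.add("taskset");
    argv.add("-cp");
    argv.add(cpuList.toString());
    argv.add(Long.toString(pid));
    return argv;
  }

  public static void main(String[] args) {
    // The resulting list could be handed to java.lang.ProcessBuilder.
    System.out.println(String.join(" ", build(12345L, 0, 1)));
  }
}
```

Because the container process is owned by the same user as the NodeManager under the DefaultContainerExecutor, a command like this would not need elevated privileges.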


I think this is the matrix of possible use cases:
1) launch container as user & use sched_setaffinity / taskset / CPU pinning
2) launch container as user & use cgroups completely managed by Hadoop
3) launch container as user & use cgroups managed by the cluster operator
4) launch container as Hadoop & use sched_setaffinity / taskset / CPU pinning
5) launch container as Hadoop & use cgroups completely managed by Hadoop
6) launch container as Hadoop & use cgroups managed by the cluster operator

Cases 1, 2, 3 and 5 require root privs.
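
The matrix can be written down as a small sketch. The rule encoded in requiresRoot() is an inferred reading of the list above (launching as the user, or letting Hadoop create cgroups itself, needs root), and the type names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the six use cases; none of these types exist in Hadoop.
final class UseCaseMatrix {
  enum LaunchAs { USER, HADOOP }
  enum Isolation { CPU_PINNING, CGROUPS_HADOOP_MANAGED, CGROUPS_OPERATOR_MANAGED }

  // Assumption: root is needed either to switch to the submitting user
  // or to create cgroups on Hadoop's behalf.
  static boolean requiresRoot(LaunchAs launchAs, Isolation isolation) {
    return launchAs == LaunchAs.USER
        || isolation == Isolation.CGROUPS_HADOOP_MANAGED;
  }

  // Numbers the cases 1..6 in the same order as the list above.
  static List<Integer> casesRequiringRoot() {
    List<Integer> cases = new ArrayList<>();
    int caseNumber = 1;
    for (LaunchAs launchAs : LaunchAs.values()) {
      for (Isolation isolation : Isolation.values()) {
        if (requiresRoot(launchAs, isolation)) {
          cases.add(caseNumber);
        }
        caseNumber++;
      }
    }
    return cases;
  }

  public static void main(String[] args) {
    System.out.println(casesRequiringRoot()); // prints [1, 2, 3, 5]
  }
}
```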

Cases 3 and 6 are covered by the patch above.

I'm happy to expand the LCE into a "hadoop root tool" which can be used in 
cases 1, 2, 3, and 5.

In my mind, the design question is how to cover all six cases with the 
greatest amount of code reuse.

Today, we have two important ContainerManager subsystems: the Launcher and the 
Monitor. Today, resource enforcement is done entirely within the Monitor. The 
question is, where should new resource enforcement be done? I think the answer 
is still "in the Monitor" even though, in some use cases, it needs access to 
root privs. To get access to those privs, it can call the LCE binary (aka the 
"hadoop root tool"), just as the java-side of the LCE does today.

So, concretely, this is my proposal:
- recognize the LCE binary as the "hadoop root tool"
- the LCE will have two new functionalities: 1) sched_setaffinity and 2) 
creating cgroups
- in addition to the patch above, I will 1) create another pluggable 
ContainersMonitor which can use the new sched_setaffinity function and 2) 
adapt the one above to optionally use the cgroup-creation functionality of 
the "hadoop root tool"

How does that sound?




> Add support for CPU isolation/monitoring of containers
> --
>
> Key: MAPREDUCE-4334
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4334
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Arun C Murthy
>Assignee: Andrew Ferguson
> Attachments: MAPREDUCE-4334-pre1.patch, 
> MAPREDUCE-4334-pre2-with_cpu.patch, MAPREDUCE-4334-pre2.patch, 
> MAPREDUCE-4334-pre3-with_cpu.patch, MAPREDUCE-4334-pre3.patch
>
>
> Once we get in MAPREDUCE-4327, it will be important to actually enforce 
> limits on CPU consumption of containers. 
> Several options spring to mind:
> # taskset (RHEL5+)
> # cgroups (RHEL6+)





[jira] [Commented] (MAPREDUCE-2355) Add an out of band heartbeat damper

2012-07-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414103#comment-13414103
 ] 

Todd Lipcon commented on MAPREDUCE-2355:


Should this be marked as resolved? It seems to have been committed in 0.20.202.

> Add an out of band heartbeat damper
> ---
>
> Key: MAPREDUCE-2355
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2355
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobtracker
>Reporter: Owen O'Malley
>Assignee: Arun C Murthy
> Attachments: mr-2355-from-sec-branch.txt
>
>
> We should have a configurable knob to throttle how many out of band 
> heartbeats are sent.





[jira] [Updated] (MAPREDUCE-4283) Display tail of aggregated logs by default

2012-07-13 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-4283:
--

Attachment: MAPREDUCE-4283.patch

Good catch.  Also I noticed there was a bug if we try to skip forward in the 
log and we encounter a premature EOF.  Incorporated fixes for those in an 
updated patch.
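
For readers following along, a minimal sketch of the kind of handling involved. This is not the code from the patch; the class name LogTail and the idea of a "declared length" are assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: shows the tail of a log whose length was declared
// up front, while tolerating a stream that ends earlier than declared.
final class LogTail {
  // Skips up to n bytes and returns how many were actually skipped.
  static long skipFully(InputStream in, long n) throws IOException {
    long remaining = n;
    while (remaining > 0) {
      long skipped = in.skip(remaining);
      if (skipped > 0) {
        remaining -= skipped;
        continue;
      }
      // skip() may return 0 without being at EOF, so probe with read().
      if (in.read() == -1) {
        break; // premature EOF: stop instead of looping forever
      }
      remaining--;
    }
    return n - remaining;
  }

  // Returns the last tailBytes of the log, or "" if the stream is truncated.
  static String tail(InputStream in, long declaredLength, int tailBytes)
      throws IOException {
    long toSkip = Math.max(0L, declaredLength - tailBytes);
    if (skipFully(in, toSkip) < toSkip) {
      return "";
    }
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[1024];
    int left = tailBytes;
    while (left > 0) {
      int n = in.read(buf, 0, Math.min(buf.length, left));
      if (n == -1) {
        break;
      }
      out.write(buf, 0, n);
      left -= n;
    }
    return new String(out.toByteArray(), StandardCharsets.UTF_8);
  }
}
```

The point of the loop in skipFully() is that a short or zero return from skip() is not an error by itself; only a read() returning -1 signals that the log ended early.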

> Display tail of aggregated logs by default
> --
>
> Key: MAPREDUCE-4283
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4283
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver, mrv2
>Affects Versions: 0.23.3, 2.0.1-alpha
>Reporter: Jason Lowe
>Assignee: Jason Lowe
> Attachments: MAPREDUCE-4283.patch, MAPREDUCE-4283.patch
>
>
> Similar to the manner in which the nodemanager webUI displays container logs, 
> it would be very useful if the historyserver showed the trailing 4K or so of 
> the aggregated logs with a link to see the full log.
> When debugging issues the relevant errors are usually at the end of the log, 
> so showing just the last few K can enable quick diagnosis without waiting for 
> what can be many megabytes of log data to download. 





[jira] [Commented] (MAPREDUCE-4444) nodemanager fails to start when one of the local-dirs is bad

2012-07-13 Thread Nathan Roberts (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414088#comment-13414088
 ] 

Nathan Roberts commented on MAPREDUCE-:
---

disk_fail_in_place should allow a volume to fail and for the nodemanager to 
continue to function. It does seem to obey 
yarn.nodemanager.disk-health-checker.min-healthy-disks while it's up, but after 
a disk has failed, it no longer starts.

 [main]2012-07-11 20:58:19,857 FATAL 
org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting 
NodeManager
 [main]org.apache.hadoop.yarn.YarnException: Failed to initialize 
LocalizationService
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.init(ResourceLocalizationService.java:202)
at 
org.apache.hadoop.yarn.service.CompositeService.init(CompositeService.java:58)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.init(ContainerManagerImpl.java:183)
at 
org.apache.hadoop.yarn.service.CompositeService.init(CompositeService.java:58)
at 
org.apache.hadoop.yarn.server.nodemanager.NodeManager.init(NodeManager.java:159)
at 
org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:260)
at 
org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:276)
Caused by: EROFS: Read-only file system
at org.apache.hadoop.io.nativeio.NativeIO.chmod(Native Method)
at 
org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:562)
at 
org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:369)
at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:888)
at 
org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:143)
at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:700)
at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:697)
at 
org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2319)
at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:697)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.init(ResourceLocalizationService.java:188)
... 6 more
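
A minimal sketch of the startup behavior one would expect instead, under stated assumptions: the class LocalDirInit is hypothetical, and min-healthy-disks is treated here as a minimum fraction of usable directories. It is not the actual ResourceLocalizationService code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: tolerate individual bad local-dirs during init and
// fail only when too few of them remain usable.
final class LocalDirInit {
  static List<Path> initLocalDirs(List<Path> dirs, float minHealthyFraction) {
    List<Path> healthy = new ArrayList<>();
    for (Path dir : dirs) {
      try {
        Files.createDirectories(dir);
        healthy.add(dir);
      } catch (IOException e) {
        // A failed disk (e.g. a read-only file system) is logged and skipped
        // rather than aborting the whole startup.
        System.err.println("Skipping bad local dir " + dir + ": " + e);
      }
    }
    if (dirs.isEmpty()
        || (float) healthy.size() / dirs.size() < minHealthyFraction) {
      throw new IllegalStateException(
          "Only " + healthy.size() + " of " + dirs.size()
              + " local dirs are usable");
    }
    return healthy;
  }
}
```

In the trace above the failure comes from a chmod on a read-only file system, so a real fix would also have to treat permission-setting errors the same way; the sketch only covers directory creation.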


> nodemanager fails to start when one of the local-dirs is bad
> 
>
> Key: MAPREDUCE-
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 0.23.3, 2.0.0-alpha, 3.0.0
>Reporter: Nathan Roberts
>Priority: Blocker
>






[jira] [Commented] (MAPREDUCE-4419) ./mapred queue -info -showJobs displays all the jobs irrespective of

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414083#comment-13414083
 ] 

Hudson commented on MAPREDUCE-4419:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #2484 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2484/])
MAPREDUCE-4419. ./mapred queue -info  -showJobs displays all the 
jobs irrespective of  (Devaraj K via bobby) (Revision 1361389)

 Result = FAILURE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1361389
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobQueueClient.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobStatus.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java


> ./mapred queue -info  -showJobs displays all the jobs irrespective 
> of  
> -
>
> Key: MAPREDUCE-4419
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4419
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Nishan Shetty
>Assignee: Devaraj K
> Fix For: 0.23.3, 2.0.1-alpha, 3.0.0
>
> Attachments: MAPREDUCE-4419.patch, screenshot-1.jpg, screenshot-2.jpg
>
>
> ./mapred queue -info  -showJobs shows all the jobs irrespective of 
> 
> In Queue name field all the jobs are showing as default queue but they are 
> submitted to the configured queue(see screenshots attached).





[jira] [Commented] (MAPREDUCE-4441) Fix build issue caused by MR-3451

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414082#comment-13414082
 ] 

Hudson commented on MAPREDUCE-4441:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #2484 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2484/])
MAPREDUCE-4441. Fix build issue caused by MR-3451 (kkambatl via tucu) 
(Revision 1361387)

 Result = FAILURE
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1361387
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFSSchedulerApp.java


> Fix build issue caused by MR-3451
> -
>
> Key: MAPREDUCE-4441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>Priority: Blocker
> Fix For: 2.0.1-alpha
>
> Attachments: MR-3451-build-fix.patch
>
>
> TestFSSchedulerApp is in the wrong package and missing some imports.





[jira] [Created] (MAPREDUCE-4444) nodemanager fails to start when one of the local-dirs is bad

2012-07-13 Thread Nathan Roberts (JIRA)
Nathan Roberts created MAPREDUCE-:
-

 Summary: nodemanager fails to start when one of the local-dirs is 
bad
 Key: MAPREDUCE-
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.0.0-alpha, 0.23.3, 3.0.0
Reporter: Nathan Roberts
Priority: Blocker








[jira] [Updated] (MAPREDUCE-4374) Fix child task environment variable config and add support for Windows

2012-07-13 Thread Chuan Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chuan Liu updated MAPREDUCE-4374:
-

Attachment: MAPREDUCE-4374-branch-1-win-2.patch

{quote}
{code}
static final String DEFAULT_HOME_DIR =
  System.getenv(Shell.WINDOWS ? "USERPROFILE" : "HOME");
{code}
{quote}
I think this is an old private constant that is only used in the 
TaskRunner class; also, it aligns with other private constants in the 
class, e.g. DEFAULT_MAPRED_ADMIN_JAVA_OPTS, DEFAULT_SHELL. So I am in favor of 
keeping it here rather than putting it in Shell.

{quote}
Firstly, why do we need this when non-Windows gets away without it? It just 
seems to be pure Java code. Secondly, if we know what abstract operation FOO we 
are doing here, then can we push it into Shell.FOO()?
{quote}
About trimming leading and trailing quotes on Windows: this is due to a subtle 
difference between ‘set’ in the cmd shell and ‘export’ in the bash shell. According to the 
[set document|http://technet.microsoft.com/en-us/library/bb490998], which I 
quote below:
??The characters <, >, |, &, ^ are special command shell characters and must be 
either preceded by the escape character (^) or enclosed in quotation marks when 
used in string (that is, "StringContaining&Symbol". If you use quotation marks 
to enclose a string containing one of the special characters, *the quotation 
marks are set as part of the environment variable value*.??
On Linux, quotation marks are not set as part of the environment variable by 
‘export’.
To launch child tasks with the correct environment, we set up the environment by 
writing a series of ‘set’ or ‘export’ commands (cf. the TaskRunner.run() method) to 
TaskController.COMMAND_FILE, and execute the file to launch the Java job (cf. the 
DefaultTaskController.launchTask() method). So when we receive the environment 
variables on Windows, we need to trim the leading and trailing quotes around them 
in the program.
This is not a perfect solution. However, note that this is the existing behavior 
from the previous change, which may work for the majority of cases. I only changed 
the code to make it more concise. Let’s not push this to Shell, as I think the 
code is simple enough to be understood, and we can work towards a better 
abstraction for this (handling ‘set’ vs ‘export’) in the future.
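
To make the trimming concrete, a minimal sketch. It is not the code in the attached patch: the class and method names are hypothetical, and it assumes only one matched pair of double quotes needs to be removed.

```java
// Hypothetical helper illustrating the Windows-side trimming described above.
final class EnvQuotes {
  // Removes one pair of enclosing double quotes, if present. On Windows,
  // 'set VAR="a&b"' stores the quotes as part of the value, so the program
  // has to strip them; with bash 'export' the quotes never reach the value.
  static String trimEnclosingQuotes(String value) {
    if (value.length() >= 2
        && value.startsWith("\"")
        && value.endsWith("\"")) {
      return value.substring(1, value.length() - 1);
    }
    return value;
  }

  public static void main(String[] args) {
    System.out.println(trimEnclosingQuotes("\"StringContaining&Symbol\""));
  }
}
```

Values without enclosing quotes pass through unchanged, so the same call would be harmless on the non-Windows path.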

{quote}
Why this code addition? Is this a general bug you have found?
{quote}
Note that this is the tmp directory. On Linux, /tmp exists by default, which is 
not the case on Windows.

{quote}
Should be easy to get the regex pattern for env vars from Shell?
{quote}
Moved to Shell in the new Patch.

{quote}
Should easily move inside Shell.getTempPath()?
{quote}
The folders here are really just text strings. The test only checks whether 
the environment variable is set correctly. We make them folder names so that 
they resemble a real-world scenario, because holding paths is a major use of 
environment variables in practice.

{quote}
This could easily use Shell.getUserHome()?
{quote}
Again, the purpose here is not to get the user home, but to test an environment 
variable. I don’t think there is a need to create a Shell function for this. 
Environment variables are very platform dependent, so I think we should accept 
the platform dependence here.


> Fix child task environment variable config and add support for Windows
> --
>
> Key: MAPREDUCE-4374
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4374
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 1-win
>Reporter: Chuan Liu
>Assignee: Chuan Liu
>Priority: Minor
> Attachments: MAPREDUCE-4374-branch-1-win-2.patch, 
> MAPREDUCE-4374-branch-1-win.patch
>
>
> In HADOOP-2838, a new feature was introduced to set environment variables via 
> the Hadoop config 'mapred.child.env' for child tasks. There are some further 
> fixes and improvements around this feature, e.g. HADOOP-5981 was a bug fix; 
> MAPREDUCE-478 broke the config into 'mapred.map.child.env' and 
> 'mapred.reduce.child.env'.  However, the current implementation is still not 
> complete. It does not match its documentation or original intent, I 
> believe. Also, by using ‘:’ (colon) and ‘;’ (semicolon) in the configuration 
> syntax, we will have problems using them on Windows because ‘:’ appears very 
> often in Windows paths, as in “C:\”, and environment variables are used very 
> often to hold path names. This Jira is created to fix the problem and provide 
> support on Windows.





[jira] [Commented] (MAPREDUCE-4437) Race in MR ApplicationMaster can cause reducers to never be scheduled

2012-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414072#comment-13414072
 ] 

Hadoop QA commented on MAPREDUCE-4437:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12536456/MAPREDUCE-4437.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2588//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2588//console

This message is automatically generated.

> Race in MR ApplicationMaster can cause reducers to never be scheduled
> -
>
> Key: MAPREDUCE-4437
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4437
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, mrv2
>Affects Versions: 0.23.3, 2.0.1-alpha
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Critical
> Attachments: MAPREDUCE-4437.patch
>
>
> If the MR AM is notified of container completion by the RM before the AM 
> receives notification of the container cleanup from the NM then it can fail 
> to schedule reducers indefinitely.  Logs showing the issue to follow.





[jira] [Updated] (MAPREDUCE-3562) Concurrency issues in MultipleOutputs,JobControl,Counters

2012-07-13 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-3562:
---

Status: Open  (was: Patch Available)

Canceling patch until the review comments can be addressed.

> Concurrency issues in MultipleOutputs,JobControl,Counters
> -
>
> Key: MAPREDUCE-3562
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3562
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.0
>Reporter: Ravi Teja Ch N V
>Assignee: Ravi Teja Ch N V
> Attachments: MAPREDUCE-3562.patch
>
>
> bq.MultipleOutputs 
>   The close of recordwriters should be synchronized. 
>   public void close() throws IOException, InterruptedException { 
> for (RecordWriter writer : recordWriters.values()) { 
>   writer.close(context); 
> bq.JobControl.java 
>   the getters of the jobs to be synchronized. 
> bq.Counters.java 
>makeEscapedCompactString to be made synchronized. 





[jira] [Commented] (MAPREDUCE-4439) MAPREDUCE-3451 introduced a bunch of findbugs warnings

2012-07-13 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414062#comment-13414062
 ] 

Karthik Kambatla commented on MAPREDUCE-4439:
-

True, Alejandro is working on it, as we speak.

> MAPREDUCE-3451 introduced a bunch of findbugs warnings
> --
>
> Key: MAPREDUCE-4439
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4439
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Arun C Murthy
>Assignee: Alejandro Abdelnur
>Priority: Blocker
>






[jira] [Updated] (MAPREDUCE-4439) MAPREDUCE-3451 introduced a bunch of findbugs warnings

2012-07-13 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4439:


Attachment: (was: MR-4439.patch)

> MAPREDUCE-3451 introduced a bunch of findbugs warnings
> --
>
> Key: MAPREDUCE-4439
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4439
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Arun C Murthy
>Assignee: Alejandro Abdelnur
>Priority: Blocker
>






[jira] [Commented] (MAPREDUCE-4299) Terasort hangs with MR2 FifoScheduler

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414058#comment-13414058
 ] 

Hudson commented on MAPREDUCE-4299:
---

Integrated in Hadoop-Common-trunk-Commit #2465 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2465/])
MAPREDUCE-4299. Terasort hangs with MR2 FifoScheduler (Tom White via bobby) 
(Revision 1361397)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1361397
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestFifoScheduler.java


> Terasort hangs with MR2 FifoScheduler
> -
>
> Key: MAPREDUCE-4299
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4299
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.0-alpha
>Reporter: Tom White
>Assignee: Tom White
> Fix For: 0.23.3, 2.0.1-alpha, 3.0.0
>
> Attachments: MAPREDUCE-4299.patch, MAPREDUCE-4299.patch, 
> MAPREDUCE-4299.patch
>
>
> What happens is that the number of reducers ramps up until they occupy all of 
> the job's containers, at which point the maps no longer make any progress and 
> the job hangs.
> When the same job is run with the CapacityScheduler it succeeds, so this 
> looks like a FifoScheduler bug.





[jira] [Commented] (MAPREDUCE-4439) MAPREDUCE-3451 introduced a bunch of findbugs warnings

2012-07-13 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414059#comment-13414059
 ] 

Arun C Murthy commented on MAPREDUCE-4439:
--

Uh, we'll need to suppress findbugs jobs?

> MAPREDUCE-3451 introduced a bunch of findbugs warnings
> --
>
> Key: MAPREDUCE-4439
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4439
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Arun C Murthy
>Assignee: Alejandro Abdelnur
>Priority: Blocker
> Attachments: MR-4439.patch
>
>






[jira] [Commented] (MAPREDUCE-4439) MAPREDUCE-3451 introduced a bunch of findbugs warnings

2012-07-13 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414060#comment-13414060
 ] 

Arun C Murthy commented on MAPREDUCE-4439:
--

Meant findbugs warnings.

> MAPREDUCE-3451 introduced a bunch of findbugs warnings
> --
>
> Key: MAPREDUCE-4439
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4439
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Arun C Murthy
>Assignee: Alejandro Abdelnur
>Priority: Blocker
> Attachments: MR-4439.patch
>
>






[jira] [Commented] (MAPREDUCE-4299) Terasort hangs with MR2 FifoScheduler

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414056#comment-13414056
 ] 

Hudson commented on MAPREDUCE-4299:
---

Integrated in Hadoop-Hdfs-trunk-Commit #2531 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2531/])
MAPREDUCE-4299. Terasort hangs with MR2 FifoScheduler (Tom White via bobby) 
(Revision 1361397)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1361397
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestFifoScheduler.java


> Terasort hangs with MR2 FifoScheduler
> -
>
> Key: MAPREDUCE-4299
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4299
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.0-alpha
>Reporter: Tom White
>Assignee: Tom White
> Fix For: 0.23.3, 2.0.1-alpha, 3.0.0
>
> Attachments: MAPREDUCE-4299.patch, MAPREDUCE-4299.patch, 
> MAPREDUCE-4299.patch
>
>
> What happens is that the number of reducers ramps up until they occupy all of 
> the job's containers, at which point the maps no longer make any progress and 
> the job hangs.
> When the same job is run with the CapacityScheduler it succeeds, so this 
> looks like a FifoScheduler bug.





[jira] [Updated] (MAPREDUCE-4437) Race in MR ApplicationMaster can cause reducers to never be scheduled

2012-07-13 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-4437:
--

Assignee: Jason Lowe
Target Version/s: 0.23.3, 2.0.1-alpha
  Status: Patch Available  (was: Open)

> Race in MR ApplicationMaster can cause reducers to never be scheduled
> -
>
> Key: MAPREDUCE-4437
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4437
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, mrv2
>Affects Versions: 0.23.3, 2.0.1-alpha
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Critical
> Attachments: MAPREDUCE-4437.patch
>
>
> If the MR AM is notified of container completion by the RM before the AM 
> receives notification of the container cleanup from the NM then it can fail 
> to schedule reducers indefinitely.  Logs showing the issue to follow.





[jira] [Updated] (MAPREDUCE-4437) Race in MR ApplicationMaster can cause reducers to never be scheduled

2012-07-13 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-4437:
--

Attachment: MAPREDUCE-4437.patch

Patch to recalculate the reduce schedule if the number of completed tasks in 
the job changes from the last time we recalculated.
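
The idea can be sketched roughly as follows. This is an illustration only, not the patch itself; the class ReduceScheduleTracker and its counter are hypothetical stand-ins for the real allocator state.

```java
// Hypothetical sketch: recalculate the reduce schedule whenever the number of
// completed tasks differs from the value seen at the last recalculation.
final class ReduceScheduleTracker {
  private int lastCompletedTasks = -1; // -1 means "never calculated"
  private int recalculations = 0;

  // Returns true if a recalculation was triggered by this call.
  synchronized boolean maybeRecalculate(int completedTasks) {
    if (completedTasks == lastCompletedTasks) {
      return false; // nothing changed since the last recalculation
    }
    lastCompletedTasks = completedTasks;
    recalculations++; // placeholder for the real schedule computation
    return true;
  }

  synchronized int recalculationCount() {
    return recalculations;
  }
}
```

Keying the decision on the completed-task count, rather than on which notification arrived, means the order of the RM and NM notifications no longer matters in this sketch.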

> Race in MR ApplicationMaster can cause reducers to never be scheduled
> -
>
> Key: MAPREDUCE-4437
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4437
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, mrv2
>Affects Versions: 0.23.3, 2.0.1-alpha
>Reporter: Jason Lowe
>Priority: Critical
> Attachments: MAPREDUCE-4437.patch
>
>
> If the MR AM is notified of container completion by the RM before the AM 
> receives notification of the container cleanup from the NM then it can fail 
> to schedule reducers indefinitely.  Logs showing the issue to follow.





[jira] [Commented] (MAPREDUCE-4334) Add support for CPU isolation/monitoring of containers

2012-07-13 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414053#comment-13414053
 ] 

Arun C Murthy commented on MAPREDUCE-4334:
--

Andrew - please don't take this the wrong way, I certainly am *not* trying to debate 
taskset v/s cgroups. All I'm saying is 'we need both' for the dominant 
platforms: RHEL5 and RHEL6. I perfectly understand that you might not have the 
time or the inclination to do both, and I'm happy to help, personally - 
supporting just RHEL6 isn't enough.

Given that, we have two options:
# Admin-setup cgroups (outside YARN) 
# YARN handles it on its own via LCE

Now the pros of using LCE:
# It already exists! Hence it doesn't impose any *new* operational 
requirements. 
# It's consistent for both technologies/platforms we need to support: 
taskset/RHEL5 and cgroups/RHEL6. 
# Even better, we can use the same for any platform in the future e.g. 
WindowsContainerExecutor (e.g. we already have WindowsTaskController in 
branch-1-win, and it would need to be ported to branch-2 soon).
# It's *much less* overhead on admins - they don't have to create cgroups 
upfront, they don't have to mount them to get them to survive reboots etc.

Cons:
# Need LCE for non-secure setups. We actually did support LTC without security 
in branch-1 at some point, happy to discuss.

In the alternative (admin-setup cgroups) we will _still_ need LCE (or worse, 
*another* setuid script) to support taskset. To me that is a very bad choice.

As a result, using LCE seems like a significantly superior alternative.



Some other comments:

bq. In my mind, the LCE is for starting processes, and should stick to doing 
that. 

Not true at all, we already use it for container cleanup etc. 

{quote}
4) For cgroups, we could have a second ContainersMonitor plugin which uses a 
setuid root binary to also mount & create cgroups, freeing the admin from 
managing them at all.
5) For taskset, we can implement a ContainersMonitor which uses a setuid root 
binary (potentially the LCE, but perhaps better if it's something else, just to 
keep the security footprint down) to pin processes to CPUs. This 
ContainersMonitor will also need the memory enforcement code from the current 
ContainersMonitorImpl
{quote}

Like I said above, rather than having two ways to do the same thing, making do 
with one *existing* component, i.e. LCE, seems like a clear choice.

I understand you might not have time to port your work via LCE, I'm happy to 
either help or take up that work.

> Add support for CPU isolation/monitoring of containers
> --
>
> Key: MAPREDUCE-4334
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4334
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Arun C Murthy
>Assignee: Andrew Ferguson
> Attachments: MAPREDUCE-4334-pre1.patch, 
> MAPREDUCE-4334-pre2-with_cpu.patch, MAPREDUCE-4334-pre2.patch, 
> MAPREDUCE-4334-pre3-with_cpu.patch, MAPREDUCE-4334-pre3.patch
>
>
> Once we get in MAPREDUCE-4327, it will be important to actually enforce 
> limits on CPU consumption of containers. 
> Several options spring to mind:
> # taskset (RHEL5+)
> # cgroups (RHEL6+)





[jira] [Assigned] (MAPREDUCE-4439) MAPREDUCE-3451 introduced a bunch of findbugs warnings

2012-07-13 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla reassigned MAPREDUCE-4439:
---

Assignee: Alejandro Abdelnur  (was: Karthik Kambatla)

> MAPREDUCE-3451 introduced a bunch of findbugs warnings
> --
>
> Key: MAPREDUCE-4439
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4439
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Arun C Murthy
>Assignee: Alejandro Abdelnur
>Priority: Blocker
> Attachments: MR-4439.patch
>
>






[jira] [Updated] (MAPREDUCE-4443) Yarn framework components (AM, job history server) should be resilient to applications exceeding counter limits

2012-07-13 Thread Rahul Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Jain updated MAPREDUCE-4443:
--

Attachment: am_failed_counter_limits.txt

Attached full application master logs illustrating the failure
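
For readers without the attachment, the failure mode in those logs can be modelled with a small self-contained sketch: task-side counters that stay under one limit are merged into a framework-side set that enforces a lower limit, and the merge throws. The classes below are a simplified, hypothetical model and not Hadoop's own Limits or counter classes; only the message format is copied from the log in the issue description.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model of the failure; not Hadoop's actual classes.
final class CounterLimitDemo {
    /** Thrown when more distinct counters are registered than the limit allows. */
    static final class LimitExceeded extends RuntimeException {
        LimitExceeded(String message) {
            super(message);
        }
    }

    /** A set of counter names that enforces a maximum number of entries. */
    static final class BoundedCounters {
        private final int max;
        private final Set<String> names = new HashSet<>();

        BoundedCounters(int max) {
            this.max = max;
        }

        void add(String name) {
            // A new name that would push the size past max is rejected.
            if (!names.contains(name) && names.size() + 1 > max) {
                throw new LimitExceeded(
                    "Too many counters: " + (names.size() + 1) + " max=" + max);
            }
            names.add(name);
        }

        int size() {
            return names.size();
        }
    }

    /** Merges task-side counter names into a framework-side set with its own limit. */
    static void mergeInto(BoundedCounters target, Iterable<String> taskCounters) {
        for (String name : taskCounters) {
            target.add(name);
        }
    }

    public static void main(String[] args) {
        // The job produced 1001 counters under its own, higher limit.
        List<String> taskCounters = new ArrayList<>();
        for (int i = 0; i < 1001; i++) {
            taskCounters.add("counter" + i);
        }
        // The framework side still enforces 1000, as in the report.
        BoundedCounters frameworkSide = new BoundedCounters(1000);
        try {
            mergeInto(frameworkSide, taskCounters);
        } catch (LimitExceeded e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the exception surfaces only during the final merge, which matches the report that all tasks finished successfully before the job failed.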

> Yarn framework components (AM, job history server) should be resilient to 
> applications exceeding counter limits 
> 
>
> Key: MAPREDUCE-4443
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4443
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Rahul Jain
> Attachments: am_failed_counter_limits.txt
>
>
> We saw this problem migrating applications to MapReduceV2:
> Our applications use hadoop counters extensively (1000+ counters for certain 
> jobs). While this may not be one of the recommended best practices in hadoop, 
> the real issue here is the reliability of the framework when applications 
> exceed counter limits.
> The hadoop servers (yarn, history server) were originally brought up with 
> mapreduce.job.counters.max=1000 under core-site.xml
> We then ran a map-reduce job under an application using its own job-specific 
> overrides, with  mapreduce.job.counters.max=1
> All the tasks for the job finished successfully; however, the overall job 
> still failed due to the AM encountering exceptions such as:
> {code}
> 2012-07-12 17:31:43,485 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 71
> 2012-07-12 17:31:43,502 FATAL [AsyncDispatcher event handler] 
> org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
> org.apache.hadoop.mapreduce.counters.LimitExceededException: Too many 
> counters: 1001 max=1000
> at 
> org.apache.hadoop.mapreduce.counters.Limits.checkCounters(Limits.java:58) 
>at org.apache.hadoop.mapreduce.counters.Limits.incrCounters(Limits.java:65)
> at 
> org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.addCounter(AbstractCounterGroup.java:77)
> at 
> org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.addCounterImpl(AbstractCounterGroup.java:94)
> at 
> org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.findCounter(AbstractCounterGroup.java:105)
> at 
> org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.incrAllCounters(AbstractCounterGroup.java:202)
> at 
> org.apache.hadoop.mapreduce.counters.AbstractCounters.incrAllCounters(AbstractCounters.java:337)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.constructFinalFullcounters(JobImpl.java:1212)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.mayBeConstructFinalFullCounters(JobImpl.java:1198)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.createJobFinishedEvent(JobImpl.java:1179)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.logJobHistoryFinishedEvent(JobImpl.java:711)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.checkJobCompleteSuccess(JobImpl.java:737)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$TaskCompletedTransition.checkJobForCompletion(JobImpl.java:1360)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$TaskCompletedTransition.transition(JobImpl.java:1340)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$TaskCompletedTransition.transition(JobImpl.java:1323)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:380)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:298)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:666)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:113)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:890)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:886)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:125)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:74)   
>  at java.lang.Thread.run(Thread.java:662)
> 2012-07-12 17:31:43,502 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye..
> 2012-07-12 17:31:43,503 INFO [Thread-1] org.apache.had
> {code}
> The overall jo

[jira] [Updated] (MAPREDUCE-4299) Terasort hangs with MR2 FifoScheduler

2012-07-13 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-4299:
---

   Resolution: Fixed
Fix Version/s: 3.0.0, 2.0.1-alpha, 0.23.3
   Status: Resolved  (was: Patch Available)

Thanks for the fix Tom.  I merged this into trunk, branch-2, and branch-0.23.

> Terasort hangs with MR2 FifoScheduler
> -
>
> Key: MAPREDUCE-4299
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4299
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.0-alpha
>Reporter: Tom White
>Assignee: Tom White
> Fix For: 0.23.3, 2.0.1-alpha, 3.0.0
>
> Attachments: MAPREDUCE-4299.patch, MAPREDUCE-4299.patch, 
> MAPREDUCE-4299.patch
>
>
> What happens is that the number of reducers ramps up until they occupy all of 
> the job's containers, at which point the maps no longer make any progress and 
> the job hangs.
> When the same job is run with the CapacityScheduler it succeeds, so this 
> looks like a FifoScheduler bug.





[jira] [Commented] (MAPREDUCE-4419) ./mapred queue -info -showJobs displays all the jobs irrespective of

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414047#comment-13414047
 ] 

Hudson commented on MAPREDUCE-4419:
---

Integrated in Hadoop-Common-trunk-Commit #2464 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2464/])
MAPREDUCE-4419. ./mapred queue -info  -showJobs displays all the 
jobs irrespective of  (Devaraj K via bobby) (Revision 1361389)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1361389
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobQueueClient.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobStatus.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java


> ./mapred queue -info  -showJobs displays all the jobs irrespective 
> of  
> -
>
> Key: MAPREDUCE-4419
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4419
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Nishan Shetty
>Assignee: Devaraj K
> Fix For: 0.23.3, 2.0.1-alpha, 3.0.0
>
> Attachments: MAPREDUCE-4419.patch, screenshot-1.jpg, screenshot-2.jpg
>
>
> ./mapred queue -info  -showJobs shows all the jobs irrespective of 
> 
> In Queue name field all the jobs are showing as default queue but they are 
> submitted to the configured queue(see screenshots attached).





[jira] [Updated] (MAPREDUCE-4442) Accessing hadoop counters from a job is unreliable in yarn during AM process cleanup window

2012-07-13 Thread Rahul Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Jain updated MAPREDUCE-4442:
--

Description: 
We found this issue during our tests moving from MapReduceV1 to MapReduceV2. A 
few of our applications access job counters multiple times:

a) After submission of job, while the job is executing (works fine)

b) Right after job complete notification is received (works fine)

c) Few seconds after job complete notification (fails most of the time).

The error snippet is as follows:

{code}
2012-07-12 19:12:29,039 WARN  [Client] Unexpected error reading responses on 
connection Thread[IPC Client (1252749669) connection to 
sjc1-ciq-ibm-grid07.carrieriq.com/10.202.50.187:47944 from hadoop,5,main]
java.lang.NullPointerException
at 
org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:852)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:781)
2012-07-12 19:12:29,044 INFO  [ClientServiceDelegate] Application state is 
completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2012-07-12 19:12:29,132 INFO  [ClientServiceDelegate] Application state is 
completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2012-07-12 19:12:29,216 ERROR [UserGroupInformation] PriviledgedActionException 
as:hadoop (auth:SIMPLE) cause:java.io.IOException
2012-07-12 19:12:29,216 WARN  [BaseOutputStageJob] getJobCounters: Unable to 
retrieve counters. null
java.io.IOException
at 
org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:315)
at 
org.apache.hadoop.mapred.ClientServiceDelegate.getJobCounters(ClientServiceDelegate.java:335)
at 
org.apache.hadoop.mapred.YARNRunner.getJobCounters(YARNRunner.java:470)
at org.apache.hadoop.mapreduce.Job$8.run(Job.java:719)
at org.apache.hadoop.mapreduce.Job$8.run(Job.java:716)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapreduce.Job.getCounters(Job.java:716)
at 
org.apache.hadoop.mapred.JobClient$NetworkedJob.getCounters(JobClient.java:396)
{code}

The connection to 10.202.50.187:47944 is actually the connection to the AM; it 
appears that we are connecting to the AM to get the counters for the successful 
job and not yet to the history server.
 
I'll attach the logs for AM and resource mgr separately, however no unusual 
activity is seen in those.

This makes me suspect that we have a race condition in the code trying to 
access job counters when AM is finishing up and the job hasn't moved to history 
server yet.

  was:
We found this issue during our tests moving from MapReduceV1 to MapReduceV2. A 
few of our applications access job counters multiple times:

a) After submission of job, while job is execution (works fine)

b) Right after job complete notification is received (works fine)

c) Few seconds after job complete notification (fails most of the times).

The error snippet is as follows:

{code}
2012-07-12 19:12:29,039 WARN  [Client] Unexpected error reading responses on 
connection Thread[IPC Client (1252749669) connection to 
sjc1-ciq-ibm-grid07.carrieriq.com/10.202.50.187:47944 from hadoop,5,main]
java.lang.NullPointerException
at 
org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:852)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:781)
2012-07-12 19:12:29,044 INFO  [ClientServiceDelegate] Application state is 
completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2012-07-12 19:12:29,132 INFO  [ClientServiceDelegate] Application state is 
completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2012-07-12 19:12:29,216 ERROR [UserGroupInformation] PriviledgedActionException 
as:hadoop (auth:SIMPLE) cause:java.io.IOException
2012-07-12 19:12:29,216 WARN  [BaseOutputStageJob] getJobCounters: Unable to 
retrieve counters. null
java.io.IOException
at 
org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:315)
at 
org.apache.hadoop.mapred.ClientServiceDelegate.getJobCounters(ClientServiceDelegate.java:335)
at 
org.apache.hadoop.mapred.YARNRunner.getJobCounters(YARNRunner.java:470)
at org.apache.hadoop.mapreduce.Job$8.run(Job.java:719)
at org.apache.hadoop.mapreduce.Job$8.run(Job.java:716)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapreduce.Job.getCounters(Job.java:716)
at 
org.apache.hadoop.mapred.JobClient$NetworkedJob.getCounters(JobClient.jav

[jira] [Commented] (MAPREDUCE-4419) ./mapred queue -info -showJobs displays all the jobs irrespective of

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414045#comment-13414045
 ] 

Hudson commented on MAPREDUCE-4419:
---

Integrated in Hadoop-Hdfs-trunk-Commit #2530 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2530/])
MAPREDUCE-4419. ./mapred queue -info  -showJobs displays all the 
jobs irrespective of  (Devaraj K via bobby) (Revision 1361389)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1361389
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobQueueClient.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobStatus.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java


> ./mapred queue -info  -showJobs displays all the jobs irrespective 
> of  
> -
>
> Key: MAPREDUCE-4419
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4419
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Nishan Shetty
>Assignee: Devaraj K
> Fix For: 0.23.3, 2.0.1-alpha, 3.0.0
>
> Attachments: MAPREDUCE-4419.patch, screenshot-1.jpg, screenshot-2.jpg
>
>
> ./mapred queue -info  -showJobs shows all the jobs irrespective of 
> 
> In Queue name field all the jobs are showing as default queue but they are 
> submitted to the configured queue(see screenshots attached).





[jira] [Commented] (MAPREDUCE-4441) Fix build issue caused by MR-3451

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414046#comment-13414046
 ] 

Hudson commented on MAPREDUCE-4441:
---

Integrated in Hadoop-Common-trunk-Commit #2464 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2464/])
MAPREDUCE-4441. Fix build issue caused by MR-3451 (kkambatl via tucu) 
(Revision 1361387)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1361387
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFSSchedulerApp.java


> Fix build issue caused by MR-3451
> -
>
> Key: MAPREDUCE-4441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>Priority: Blocker
> Fix For: 2.0.1-alpha
>
> Attachments: MR-3451-build-fix.patch
>
>
> TestFSSchedulerApp is in the wrong package and missing some imports.





[jira] [Commented] (MAPREDUCE-4441) Fix build issue caused by MR-3451

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414044#comment-13414044
 ] 

Hudson commented on MAPREDUCE-4441:
---

Integrated in Hadoop-Hdfs-trunk-Commit #2530 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2530/])
MAPREDUCE-4441. Fix build issue caused by MR-3451 (kkambatl via tucu) 
(Revision 1361387)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1361387
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFSSchedulerApp.java


> Fix build issue caused by MR-3451
> -
>
> Key: MAPREDUCE-4441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>Priority: Blocker
> Fix For: 2.0.1-alpha
>
> Attachments: MR-3451-build-fix.patch
>
>
> TestFSSchedulerApp is in the wrong package and missing some imports.





[jira] [Updated] (MAPREDUCE-4442) Accessing hadoop counters from a job is unreliable in yarn during AM process cleanup window

2012-07-13 Thread Rahul Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Jain updated MAPREDUCE-4442:
--

Summary: Accessing hadoop counters from a job is unreliable in yarn during 
AM process cleanup  window  (was: Accessing hadoop counters from a job is 
unreliable in yarn during in AM process cleanup  window)

> Accessing hadoop counters from a job is unreliable in yarn during AM process 
> cleanup  window
> 
>
> Key: MAPREDUCE-4442
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4442
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Rahul Jain
> Attachments: am_logs_counter_failure.html, 
> rsrc_mgr_logs_counter_failed.txt
>
>
> We found this issue during our tests moving from MapReduceV1 to MapReduceV2. 
> A few of our applications access job counters multiple times:
> a) After submission of job, while the job is executing (works fine)
> b) Right after job complete notification is received (works fine)
> c) Few seconds after job complete notification (fails most of the times).
> The error snippet is as follows:
> {code}
> 2012-07-12 19:12:29,039 WARN  [Client] Unexpected error reading responses on 
> connection Thread[IPC Client (1252749669) connection to 
> sjc1-ciq-ibm-grid07.carrieriq.com/10.202.50.187:47944 from hadoop,5,main]
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:852)
>   at org.apache.hadoop.ipc.Client$Connection.run(Client.java:781)
> 2012-07-12 19:12:29,044 INFO  [ClientServiceDelegate] Application state is 
> completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
> 2012-07-12 19:12:29,132 INFO  [ClientServiceDelegate] Application state is 
> completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
> 2012-07-12 19:12:29,216 ERROR [UserGroupInformation] 
> PriviledgedActionException as:hadoop (auth:SIMPLE) cause:java.io.IOException
> 2012-07-12 19:12:29,216 WARN  [BaseOutputStageJob] getJobCounters: Unable to 
> retrieve counters. null
> java.io.IOException
>   at 
> org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:315)
>   at 
> org.apache.hadoop.mapred.ClientServiceDelegate.getJobCounters(ClientServiceDelegate.java:335)
>   at 
> org.apache.hadoop.mapred.YARNRunner.getJobCounters(YARNRunner.java:470)
>   at org.apache.hadoop.mapreduce.Job$8.run(Job.java:719)
>   at org.apache.hadoop.mapreduce.Job$8.run(Job.java:716)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>   at org.apache.hadoop.mapreduce.Job.getCounters(Job.java:716)
>   at 
> org.apache.hadoop.mapred.JobClient$NetworkedJob.getCounters(JobClient.java:396)
> {code}
> The connection to 10.202.50.187:47944 is actually the connection to the AM; 
> it appears that we are connecting to the AM to get the counters for the 
> successful job and not the history server.
>  
> I'll attach the logs for AM and resource mgr separately, however no unusual 
> activity is seen in those.
> This makes me suspect that we have a race condition in the code trying to 
> access job counters when AM is finishing up and the job hasn't moved to 
> history server yet.





[jira] [Commented] (MAPREDUCE-4428) A failed job is not available under job history if the job is killed right around the time job is notified as failed

2012-07-13 Thread Rahul Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414039#comment-13414039
 ] 

Rahul Jain commented on MAPREDUCE-4428:
---

MAPREDUCE-4443 created to track the AM reliability for counters limit exceeded 
issue.

> A failed job is not available under job history if the job is killed right 
> around the time job is notified as failed 
> -
>
> Key: MAPREDUCE-4428
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4428
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver, jobtracker
>Affects Versions: 2.0.0-alpha
>Reporter: Rahul Jain
>Assignee: Robert Joseph Evans
> Attachments: am_failed_counter_limits.txt, appMaster_bad.txt, 
> appMaster_good.txt, resrcmgr_bad.txt
>
>
> We have observed this issue consistently running hadoop CDH4 version (based 
> upon 2.0 alpha release):
> When our hadoop client code gets a notification for a completed job (using a 
> RunningJob object job, with job.isComplete() && 
> job.isSuccessful()==false), 
> the hadoop client code does an unconditional job.killJob() to terminate the 
> job.
> With earlier hadoop versions (verified on hadoop 0.20.2 version), we still  
> have full access to job logs afterwards through hadoop console. However, when 
> using MapReduceV2, the failed hadoop job no longer shows up under jobhistory 
> server. Also, the tracking URL of the job still points to the non-existent 
> Application master http port.
> Once we removed the call to job.killJob() for failed jobs from our hadoop 
> client code, we were able to access the job in job history with mapreduce V2 
> as well. Therefore this appears to be a race condition in the job management 
> wrt. job history for failed jobs.
> We do have the application master and node manager logs collected for this 
> scenario if that'll help isolate the problem and the fix better.
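
The client-side pattern described in the report can be sketched roughly as follows. JobHandle is a hypothetical stand-in for the three RunningJob calls named above (isComplete, isSuccessful, killJob); the real Hadoop methods declare IOException, which is left out here to keep the example self-contained. The killFailedJobs flag is an assumption added only to contrast the behaviour before and after the reporter removed the killJob() call.

```java
// Hypothetical stand-in for the three RunningJob calls named in the report.
// The real Hadoop methods declare IOException; that is omitted for brevity.
interface JobHandle {
    boolean isComplete();
    boolean isSuccessful();
    void killJob();
}

final class FailedJobHandler {
    /**
     * Reacts to a job notification. With killFailedJobs set to true this
     * mirrors the client behaviour from the report: a completed, unsuccessful
     * job is killed unconditionally. Returns true if killJob() was issued.
     */
    static boolean onNotification(JobHandle job, boolean killFailedJobs) {
        boolean failed = job.isComplete() && !job.isSuccessful();
        if (failed && killFailedJobs) {
            job.killJob();
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        final int[] kills = {0};
        // A fake job that has completed unsuccessfully and counts kill requests.
        JobHandle failedJob = new JobHandle() {
            public boolean isComplete() { return true; }
            public boolean isSuccessful() { return false; }
            public void killJob() { kills[0]++; }
        };
        System.out.println(onNotification(failedJob, true));  // kill issued
        System.out.println(onNotification(failedJob, false)); // kill skipped
        System.out.println(kills[0]);
    }
}
```

With the flag off, the failed job is left alone, which corresponds to the configuration in which the reporter could still find the job in the job history.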





[jira] [Commented] (MAPREDUCE-4299) Terasort hangs with MR2 FifoScheduler

2012-07-13 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414037#comment-13414037
 ] 

Robert Joseph Evans commented on MAPREDUCE-4299:


I am a +1 too.  I'll check this in.

> Terasort hangs with MR2 FifoScheduler
> -
>
> Key: MAPREDUCE-4299
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4299
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.0-alpha
>Reporter: Tom White
>Assignee: Tom White
> Attachments: MAPREDUCE-4299.patch, MAPREDUCE-4299.patch, 
> MAPREDUCE-4299.patch
>
>
> What happens is that the number of reducers ramps up until they occupy all of 
> the job's containers, at which point the maps no longer make any progress and 
> the job hangs.
> When the same job is run with the CapacityScheduler it succeeds, so this 
> looks like a FifoScheduler bug.





[jira] [Updated] (MAPREDUCE-4439) MAPREDUCE-3451 introduced a bunch of findbugs warnings

2012-07-13 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4439:


Attachment: MR-4439.patch

Just added @SuppressWarnings("all").

> MAPREDUCE-3451 introduced a bunch of findbugs warnings
> --
>
> Key: MAPREDUCE-4439
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4439
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Arun C Murthy
>Assignee: Karthik Kambatla
>Priority: Blocker
> Attachments: MR-4439.patch
>
>






[jira] [Assigned] (MAPREDUCE-4439) MAPREDUCE-3451 introduced a bunch of findbugs warnings

2012-07-13 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla reassigned MAPREDUCE-4439:
---

Assignee: Karthik Kambatla

> MAPREDUCE-3451 introduced a bunch of findbugs warnings
> --
>
> Key: MAPREDUCE-4439
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4439
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Arun C Murthy
>Assignee: Karthik Kambatla
>Priority: Blocker
> Attachments: MR-4439.patch
>
>






[jira] [Created] (MAPREDUCE-4443) Yarn framework components (AM, job history server) should be resilient to applications exceeding counter limits

2012-07-13 Thread Rahul Jain (JIRA)
Rahul Jain created MAPREDUCE-4443:
-

 Summary: Yarn framework components (AM, job history server) should 
be resilient to applications exceeding counter limits 
 Key: MAPREDUCE-4443
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4443
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.0.0-alpha
Reporter: Rahul Jain


We saw this problem migrating applications to MapReduceV2:

Our applications use hadoop counters extensively (1000+ counters for certain 
jobs). While this may not be one of the recommended best practices in hadoop, 
the real issue here is the reliability of the framework when applications 
exceed counter limits.

The hadoop servers (yarn, history server) were originally brought up with 
mapreduce.job.counters.max=1000 under core-site.xml

We then ran a map-reduce job under an application using its own job-specific 
overrides, with mapreduce.job.counters.max=1

All the tasks for the job finished successfully; however, the overall job still 
failed because the AM encountered exceptions such as:

{code}
2012-07-12 17:31:43,485 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 71
2012-07-12 17:31:43,502 FATAL [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
org.apache.hadoop.mapreduce.counters.LimitExceededException: Too many counters: 1001 max=1000
    at org.apache.hadoop.mapreduce.counters.Limits.checkCounters(Limits.java:58)
    at org.apache.hadoop.mapreduce.counters.Limits.incrCounters(Limits.java:65)
    at org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.addCounter(AbstractCounterGroup.java:77)
    at org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.addCounterImpl(AbstractCounterGroup.java:94)
    at org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.findCounter(AbstractCounterGroup.java:105)
    at org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.incrAllCounters(AbstractCounterGroup.java:202)
    at org.apache.hadoop.mapreduce.counters.AbstractCounters.incrAllCounters(AbstractCounters.java:337)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.constructFinalFullcounters(JobImpl.java:1212)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.mayBeConstructFinalFullCounters(JobImpl.java:1198)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.createJobFinishedEvent(JobImpl.java:1179)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.logJobHistoryFinishedEvent(JobImpl.java:711)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.checkJobCompleteSuccess(JobImpl.java:737)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$TaskCompletedTransition.checkJobForCompletion(JobImpl.java:1360)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$TaskCompletedTransition.transition(JobImpl.java:1340)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$TaskCompletedTransition.transition(JobImpl.java:1323)
    at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:380)
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:298)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:666)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:113)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:890)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:886)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:125)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:74)
    at java.lang.Thread.run(Thread.java:662)
2012-07-12 17:31:43,502 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye..
2012-07-12 17:31:43,503 INFO [Thread-1] org.apache.had
{code}

The overall job failed, and the job history wasn't accessible either at the end 
of the job (didn't show up in job history server).
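
One way to picture the resilience being asked for is sketched below. This is a minimal, self-contained illustration and not the Hadoop implementation: CounterLimitSketch, MergeResult and mergeLenient are hypothetical names, and the policy shown (keep counting known names, drop new names once the limit is reached, report how many were dropped) is only an assumption about what a tolerant merge could look like, as opposed to letting a LimitExceededException end the dispatcher thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from Hadoop.
class CounterLimitSketch {

    /** Result of a lenient merge: the counters kept and how many new names were dropped. */
    static final class MergeResult {
        final Map<String, Long> counters;
        final int dropped;

        MergeResult(Map<String, Long> counters, int dropped) {
            this.counters = counters;
            this.dropped = dropped;
        }
    }

    /**
     * Adds every task counter into a copy of the job totals. Names that already
     * exist are always incremented; a new name is added only while the number of
     * distinct counters stays within max. Excess names are counted, not thrown on.
     */
    static MergeResult mergeLenient(Map<String, Long> jobTotals,
                                    Map<String, Long> taskCounters,
                                    int max) {
        Map<String, Long> merged = new LinkedHashMap<>(jobTotals);
        int dropped = 0;
        for (Map.Entry<String, Long> e : taskCounters.entrySet()) {
            if (merged.containsKey(e.getKey()) || merged.size() < max) {
                merged.merge(e.getKey(), e.getValue(), Long::sum);
            } else {
                dropped++; // over the limit: skip this counter instead of failing the job
            }
        }
        return new MergeResult(merged, dropped);
    }

    public static void main(String[] args) {
        Map<String, Long> job = new LinkedHashMap<>();
        job.put("A", 1L);
        Map<String, Long> task = new LinkedHashMap<>();
        task.put("A", 2L);
        task.put("B", 5L);
        task.put("C", 7L);
        MergeResult r = mergeLenient(job, task, 2);
        System.out.println(r.counters + " dropped=" + r.dropped);
    }
}
```

With a policy like this the job history could still be written, and the dropped count could be logged as a warning so the user knows the counter set is incomplete.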

We were able to work around the issue by raising the limits in core-site.xml 
and restarting the yarn servers. However, that forced us to raise the global 
counter limit to the highest value any individual application might use, which 
is hard to predict.

The original job then succeeded with new global limits. 

However, since we didn't restart the job history server, it was unable to 
display job history page for the succe

[jira] [Commented] (MAPREDUCE-4375) Show Configuration Tracability in MR UI

2012-07-13 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414036#comment-13414036
 ] 

Robert Joseph Evans commented on MAPREDUCE-4375:


No tests were provided because it is a UI change and testing the UI has never 
been very good, and this block is already being tested as well as any other UI 
code.

> Show Configuration Tracability in MR UI
> ---
>
> Key: MAPREDUCE-4375
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4375
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster
>Affects Versions: 0.23.3
>Reporter: Robert Joseph Evans
>Assignee: Robert Joseph Evans
> Attachments: MR-4375.txt
>
>
> Once HADOOP-8525 goes in we should provide a way for the Configuration UI to 
> display the traceability information.





[jira] [Updated] (MAPREDUCE-4419) ./mapred queue -info -showJobs displays all the jobs irrespective of

2012-07-13 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-4419:
---

   Resolution: Fixed
Fix Version/s: 3.0.0
   2.0.1-alpha
   0.23.3
   Status: Resolved  (was: Patch Available)

> ./mapred queue -info  -showJobs displays all the jobs irrespective 
> of  
> -
>
> Key: MAPREDUCE-4419
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4419
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Nishan Shetty
>Assignee: Devaraj K
> Fix For: 0.23.3, 2.0.1-alpha, 3.0.0
>
> Attachments: MAPREDUCE-4419.patch, screenshot-1.jpg, screenshot-2.jpg
>
>
> ./mapred queue -info  -showJobs shows all the jobs irrespective of 
> 
> In the Queue name field all the jobs are shown under the default queue, but 
> they were submitted to the configured queue (see screenshots attached).





[jira] [Updated] (MAPREDUCE-4441) Fix build issue caused by MR-3451

2012-07-13 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated MAPREDUCE-4441:
--

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Thanks Karthik. Committed to trunk and branch-2

> Fix build issue caused by MR-3451
> -
>
> Key: MAPREDUCE-4441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>Priority: Blocker
> Fix For: 2.0.1-alpha
>
> Attachments: MR-3451-build-fix.patch
>
>
> TestFSSchedulerApp is in the wrong package and missing some imports.





[jira] [Commented] (MAPREDUCE-4441) Fix build issue caused by MR-3451

2012-07-13 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414032#comment-13414032
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4441:
---

+1

> Fix build issue caused by MR-3451
> -
>
> Key: MAPREDUCE-4441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>Priority: Blocker
> Fix For: 2.0.1-alpha
>
> Attachments: MR-3451-build-fix.patch
>
>
> TestFSSchedulerApp is in the wrong package and missing some imports.





[jira] [Updated] (MAPREDUCE-3348) mapred job -status fails to give info even if the job is present in History

2012-07-13 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-3348:
---

Fix Version/s: (was: 0.23.2)
   0.23.3

Somehow this made it into trunk and branch-2, but not branch-0.23, so I put it 
in now.

> mapred job -status fails to give info even if the job is present in History
> ---
>
> Key: MAPREDUCE-3348
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3348
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.0
>Reporter: Devaraj K
>Assignee: Devaraj K
> Fix For: 0.23.3
>
> Attachments: MAPREDUCE-3348-1.patch, MAPREDUCE-3348.patch
>
>
> The client tries to get the app report for the job from the RM. The RM throws 
> an exception when it does not find the application, and the client then 
> reports that same exception without trying the History Server.
> {code}
> 11/11/03 08:47:27 INFO ipc.HadoopYarnRPC: Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.mapreduce.v2.api.MRClientProtocol
> 11/11/03 08:47:28 WARN mapred.ClientServiceDelegate: Exception thrown by remote end.
> RemoteTrace:
>  at LocalTrace:
> org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: Trying to get information for an absent application application_1320278804241_0002
>     at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:142)
>     at $Proxy6.getApplicationReport(Unknown Source)
>     at org.apache.hadoop.yarn.api.impl.pb.client.ClientRMProtocolPBClientImpl.getApplicationReport(ClientRMProtocolPBClientImpl.java:111)
>     at org.apache.hadoop.mapred.ResourceMgrDelegate.getApplicationReport(ResourceMgrDelegate.java:321)
>     at org.apache.hadoop.mapred.ClientServiceDelegate.getProxy(ClientServiceDelegate.java:137)
>     at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:273)
>     at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:353)
>     at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:429)
>     at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:186)
>     at org.apache.hadoop.mapreduce.tools.CLI.run(CLI.java:240)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:83)
>     at org.apache.hadoop.mapred.JobClient.main(JobClient.java:1106)
> Exception in thread "main" RemoteTrace:
>  at Local Trace:
> org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: Trying to get information for an absent application application_1320278804241_0002
>     at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:142)
>     at $Proxy6.getApplicationReport(Unknown Source)
>     at org.apache.hadoop.yarn.api.impl.pb.client.ClientRMProtocolPBClientImpl.getApplicationReport(ClientRMProtocolPBClientImpl.java:111)
>     at org.apache.hadoop.mapred.ResourceMgrDelegate.getApplicationReport(ResourceMgrDelegate.java:321)
>     at org.apache.hadoop.mapred.ClientServiceDelegate.getProxy(ClientServiceDelegate.java:137)
>     at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:273)
>     at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:353)
>     at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:429)
>     at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:186)
>     at org.apache.hadoop.mapreduce.tools.CLI.run(CLI.java:240)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:83)
>     at org.apache.hadoop.mapred.JobClient.main(JobClient.java:1106)
> {code}
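
The behavior the quoted report asks for amounts to a fallback lookup, which can be sketched roughly as below. This is only an illustration under stated assumptions, not the committed patch: StatusFallbackSketch and StatusLookup are hypothetical names, and the status is modeled as a plain string. The primary lookup stands in for the RM and the fallback for the History Server.

```java
import java.io.IOException;

// Hypothetical sketch; these types are not part of Hadoop.
class StatusFallbackSketch {

    /** Hypothetical lookup of a job status by job id. */
    interface StatusLookup {
        String getStatus(String jobId) throws IOException;
    }

    /**
     * Asks the primary source first and consults the fallback only when the
     * primary lookup fails. If both fail, the fallback's exception is thrown
     * with the primary failure attached as a suppressed exception.
     */
    static String getStatus(String jobId, StatusLookup primary, StatusLookup fallback)
            throws IOException {
        try {
            return primary.getStatus(jobId);
        } catch (IOException primaryFailure) {
            try {
                return fallback.getStatus(jobId);
            } catch (IOException fallbackFailure) {
                fallbackFailure.addSuppressed(primaryFailure);
                throw fallbackFailure;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        StatusLookup rm = id -> {
            throw new IOException("Trying to get information for an absent application");
        };
        StatusLookup history = id -> "SUCCEEDED";
        System.out.println(getStatus("job_1320278804241_0002", rm, history));
    }
}
```

Keeping the primary failure as a suppressed exception means the original "absent application" message is still visible if the fallback also fails.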





[jira] [Commented] (MAPREDUCE-4419) ./mapred queue -info -showJobs displays all the jobs irrespective of

2012-07-13 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414030#comment-13414030
 ] 

Robert Joseph Evans commented on MAPREDUCE-4419:


The change looks good to me +1, thanks.

> ./mapred queue -info  -showJobs displays all the jobs irrespective 
> of  
> -
>
> Key: MAPREDUCE-4419
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4419
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Nishan Shetty
>Assignee: Devaraj K
> Attachments: MAPREDUCE-4419.patch, screenshot-1.jpg, screenshot-2.jpg
>
>
> ./mapred queue -info  -showJobs shows all the jobs irrespective of 
> 
> In the Queue name field all the jobs are shown under the default queue, but 
> they were submitted to the configured queue (see screenshots attached).





[jira] [Updated] (MAPREDUCE-4442) Accessing hadoop counters from a job is unreliable in yarn during in AM process cleanup window

2012-07-13 Thread Rahul Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Jain updated MAPREDUCE-4442:
--

Attachment: rsrc_mgr_logs_counter_failed.txt

Here is the snippet from the resource mgr logs, relevant to the time this 
happened.

> Accessing hadoop counters from a job is unreliable in yarn during in AM 
> process cleanup  window
> ---
>
> Key: MAPREDUCE-4442
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4442
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Rahul Jain
> Attachments: am_logs_counter_failure.html, 
> rsrc_mgr_logs_counter_failed.txt
>
>
> We found this issue during our tests moving from MapReduceV1 to MapReduceV2. 
> A few of our applications access job counters multiple times:
> a) After submission of the job, while the job is executing (works fine)
> b) Right after job complete notification is received (works fine)
> c) A few seconds after job complete notification (fails most of the time).
> The error snippet is as follows:
> {code}
> 2012-07-12 19:12:29,039 WARN  [Client] Unexpected error reading responses on 
> connection Thread[IPC Client (1252749669) connection to 
> sjc1-ciq-ibm-grid07.carrieriq.com/10.202.50.187:47944 from hadoop,5,main]
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:852)
>   at org.apache.hadoop.ipc.Client$Connection.run(Client.java:781)
> 2012-07-12 19:12:29,044 INFO  [ClientServiceDelegate] Application state is 
> completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
> 2012-07-12 19:12:29,132 INFO  [ClientServiceDelegate] Application state is 
> completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
> 2012-07-12 19:12:29,216 ERROR [UserGroupInformation] 
> PriviledgedActionException as:hadoop (auth:SIMPLE) cause:java.io.IOException
> 2012-07-12 19:12:29,216 WARN  [BaseOutputStageJob] getJobCounters: Unable to 
> retrieve counters. null
> java.io.IOException
>   at 
> org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:315)
>   at 
> org.apache.hadoop.mapred.ClientServiceDelegate.getJobCounters(ClientServiceDelegate.java:335)
>   at 
> org.apache.hadoop.mapred.YARNRunner.getJobCounters(YARNRunner.java:470)
>   at org.apache.hadoop.mapreduce.Job$8.run(Job.java:719)
>   at org.apache.hadoop.mapreduce.Job$8.run(Job.java:716)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>   at org.apache.hadoop.mapreduce.Job.getCounters(Job.java:716)
>   at 
> org.apache.hadoop.mapred.JobClient$NetworkedJob.getCounters(JobClient.java:396)
> {code}
> The connection to 10.202.50.187:47944 is actually the connection to the AM; it 
> appears that we are connecting to the AM, and not the history server, to get 
> the counters for the successful job.
>  
> I'll attach the logs for the AM and resource mgr separately; however, no 
> unusual activity is seen in those.
> This makes me suspect that we have a race condition in the code trying to 
> access job counters when AM is finishing up and the job hasn't moved to 
> history server yet.





[jira] [Updated] (MAPREDUCE-4442) Accessing hadoop counters from a job is unreliable in yarn during in AM process cleanup window

2012-07-13 Thread Rahul Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Jain updated MAPREDUCE-4442:
--

Attachment: am_logs_counter_failure.html

Attached AM logs for the full job; the timestamps should correlate to the 
application timestamps.

> Accessing hadoop counters from a job is unreliable in yarn during in AM 
> process cleanup  window
> ---
>
> Key: MAPREDUCE-4442
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4442
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Rahul Jain
> Attachments: am_logs_counter_failure.html
>
>
> We found this issue during our tests moving from MapReduceV1 to MapReduceV2. 
> A few of our applications access job counters multiple times:
> a) After submission of the job, while the job is executing (works fine)
> b) Right after job complete notification is received (works fine)
> c) A few seconds after job complete notification (fails most of the time).
> The error snippet is as follows:
> {code}
> 2012-07-12 19:12:29,039 WARN  [Client] Unexpected error reading responses on 
> connection Thread[IPC Client (1252749669) connection to 
> sjc1-ciq-ibm-grid07.carrieriq.com/10.202.50.187:47944 from hadoop,5,main]
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:852)
>   at org.apache.hadoop.ipc.Client$Connection.run(Client.java:781)
> 2012-07-12 19:12:29,044 INFO  [ClientServiceDelegate] Application state is 
> completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
> 2012-07-12 19:12:29,132 INFO  [ClientServiceDelegate] Application state is 
> completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
> 2012-07-12 19:12:29,216 ERROR [UserGroupInformation] 
> PriviledgedActionException as:hadoop (auth:SIMPLE) cause:java.io.IOException
> 2012-07-12 19:12:29,216 WARN  [BaseOutputStageJob] getJobCounters: Unable to 
> retrieve counters. null
> java.io.IOException
>   at 
> org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:315)
>   at 
> org.apache.hadoop.mapred.ClientServiceDelegate.getJobCounters(ClientServiceDelegate.java:335)
>   at 
> org.apache.hadoop.mapred.YARNRunner.getJobCounters(YARNRunner.java:470)
>   at org.apache.hadoop.mapreduce.Job$8.run(Job.java:719)
>   at org.apache.hadoop.mapreduce.Job$8.run(Job.java:716)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>   at org.apache.hadoop.mapreduce.Job.getCounters(Job.java:716)
>   at 
> org.apache.hadoop.mapred.JobClient$NetworkedJob.getCounters(JobClient.java:396)
> {code}
> The connection to 10.202.50.187:47944 is actually the connection to the AM; it 
> appears that we are connecting to the AM, and not the history server, to get 
> the counters for the successful job.
>  
> I'll attach the logs for the AM and resource mgr separately; however, no 
> unusual activity is seen in those.
> This makes me suspect that we have a race condition in the code trying to 
> access job counters when AM is finishing up and the job hasn't moved to 
> history server yet.





[jira] [Updated] (MAPREDUCE-4442) Accessing hadoop counters from a job is unreliable in yarn during in AM process cleanup window

2012-07-13 Thread Rahul Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Jain updated MAPREDUCE-4442:
--

Description: 
We found this issue during our tests moving from MapReduceV1 to MapReduceV2. A 
few of our applications access job counters multiple times:

a) After submission of the job, while the job is executing (works fine)

b) Right after job complete notification is received (works fine)

c) A few seconds after job complete notification (fails most of the time).

The error snippet is as follows:

{code}
2012-07-12 19:12:29,039 WARN  [Client] Unexpected error reading responses on 
connection Thread[IPC Client (1252749669) connection to 
sjc1-ciq-ibm-grid07.carrieriq.com/10.202.50.187:47944 from hadoop,5,main]
java.lang.NullPointerException
at 
org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:852)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:781)
2012-07-12 19:12:29,044 INFO  [ClientServiceDelegate] Application state is 
completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2012-07-12 19:12:29,132 INFO  [ClientServiceDelegate] Application state is 
completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2012-07-12 19:12:29,216 ERROR [UserGroupInformation] PriviledgedActionException 
as:hadoop (auth:SIMPLE) cause:java.io.IOException
2012-07-12 19:12:29,216 WARN  [BaseOutputStageJob] getJobCounters: Unable to 
retrieve counters. null
java.io.IOException
at 
org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:315)
at 
org.apache.hadoop.mapred.ClientServiceDelegate.getJobCounters(ClientServiceDelegate.java:335)
at 
org.apache.hadoop.mapred.YARNRunner.getJobCounters(YARNRunner.java:470)
at org.apache.hadoop.mapreduce.Job$8.run(Job.java:719)
at org.apache.hadoop.mapreduce.Job$8.run(Job.java:716)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapreduce.Job.getCounters(Job.java:716)
at 
org.apache.hadoop.mapred.JobClient$NetworkedJob.getCounters(JobClient.java:396)
{code}

The connection to 10.202.50.187:47944 is actually the connection to the AM; it 
appears that we are connecting to the AM, and not the history server, to get 
the counters for the successful job.
 
I'll attach the logs for the AM and resource mgr separately; however, no 
unusual activity is seen in those.

This makes me suspect that we have a race condition in the code trying to 
access job counters when AM is finishing up and the job hasn't moved to history 
server yet.
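
If the failure really is a transient race during AM shutdown, a client could retry the counter lookup for a short window. The following is a minimal sketch under that assumption and does not fix the underlying issue: CounterRetrySketch, CounterSource and fetchWithRetry are hypothetical names, with CounterSource standing in for the application's call to Job.getCounters() seen in the trace above.

```java
import java.io.IOException;

// Hypothetical sketch; these types are not part of Hadoop.
class CounterRetrySketch {

    /** Hypothetical stand-in for the application's counter lookup. */
    interface CounterSource<T> {
        T fetch() throws IOException;
    }

    /**
     * Calls source.fetch() up to maxAttempts times, sleeping delayMillis between
     * failed attempts, and rethrows the last IOException if every attempt fails.
     */
    static <T> T fetchWithRetry(CounterSource<T> source, int maxAttempts, long delayMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return source.fetch();
            } catch (IOException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis); // give the job time to reach the history server
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        String result = fetchWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IOException("AM shutting down");
            }
            return "counters";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

The attempt count and delay are arbitrary here; suitable values would depend on how long the window between AM exit and history availability turns out to be.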


[jira] [Commented] (MAPREDUCE-3451) Port Fair Scheduler to MR2

2012-07-13 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414024#comment-13414024
 ] 

Alejandro Abdelnur commented on MAPREDUCE-3451:
---

I'm working on amending the patch.

> Port Fair Scheduler to MR2
> --
>
> Key: MAPREDUCE-3451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3451
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv2, scheduler
>Reporter: Patrick Wendell
>Assignee: Patrick Wendell
> Fix For: 2.0.1-alpha
>
> Attachments: MAPREDUCE-3451.v1.patch.txt, 
> MAPREDUCE-3451.v2.patch.txt, MAPREDUCE-3451.v3.patch.txt, 
> MAPREDUCE-3451.v4.patch.txt, MAPREDUCE-3451.v5.patch, 
> MAPREDUCE-3451.v6.patch, MAPREDUCE-3451.v7.patch, MAPREDUCE-3451.v8.patch, 
> MAPREDUCE-3451.v9.patch
>
>
> The Fair Scheduler is in widespread use today in MR1 clusters, but not yet 
> ported to MR2. This is to track the porting of the Fair Scheduler to MR2 and 
> will be updated to include design considerations and progress.





[jira] [Updated] (MAPREDUCE-4441) Fix build issue caused by MR-3451

2012-07-13 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4441:


Priority: Blocker  (was: Major)

> Fix build issue caused by MR-3451
> -
>
> Key: MAPREDUCE-4441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>Priority: Blocker
> Fix For: 2.0.1-alpha
>
> Attachments: MR-3451-build-fix.patch
>
>
> TestFSSchedulerApp is in the wrong package and missing some imports.





[jira] [Commented] (MAPREDUCE-4428) A failed job is not available under job history if the job is killed right around the time job is notified as failed

2012-07-13 Thread Rahul Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414017#comment-13414017
 ] 

Rahul Jain commented on MAPREDUCE-4428:
---

OK, will create a separate one for the counter limit exceeded issue.

BTW, I did open MAPREDUCE-4442 for a related issue: we are also unable to access 
job counters during the period when the AM is possibly shutting down. It may be 
a good idea to consider that issue in the final fix. 

> A failed job is not available under job history if the job is killed right 
> around the time job is notified as failed 
> -
>
> Key: MAPREDUCE-4428
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4428
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver, jobtracker
>Affects Versions: 2.0.0-alpha
>Reporter: Rahul Jain
>Assignee: Robert Joseph Evans
> Attachments: am_failed_counter_limits.txt, appMaster_bad.txt, 
> appMaster_good.txt, resrcmgr_bad.txt
>
>
> We have observed this issue consistently running hadoop CDH4 version (based 
> upon 2.0 alpha release):
> When our hadoop client code gets a notification for a completed job (using 
> the RunningJob object job, with job.isComplete() && 
> job.isSuccessful()==false), the hadoop client code does an unconditional 
> job.killJob() to terminate the job.
> With earlier hadoop versions (verified on hadoop 0.20.2 version), we still  
> have full access to job logs afterwards through hadoop console. However, when 
> using MapReduceV2, the failed hadoop job no longer shows up under jobhistory 
> server. Also, the tracking URL of the job still points to the non-existent 
> Application master http port.
> Once we removed the call to job.killJob() for failed jobs from our hadoop 
> client code, we were able to access the job in job history with mapreduce V2 
> as well. Therefore this appears to be a race condition in the job management 
> wrt. job history for failed jobs.
> We do have the application master and node manager logs collected for this 
> scenario if that'll help isolate the problem and the fix better.
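
The client-side change described above (no longer calling killJob() on a job 
that has already completed) can be sketched as follows. This is only an 
illustration: JobHandle is a hypothetical stand-in for the two RunningJob 
calls mentioned in the report, not the Hadoop interface itself, and the guard 
is a workaround rather than a fix for the underlying race.

```java
final class KillGuard {
    private KillGuard() {}

    /** Hypothetical stand-in for the RunningJob calls used in the report. */
    interface JobHandle {
        boolean isComplete();
        void killJob();
    }

    /**
     * Kills the job only if it has not completed yet.
     * Returns true when a kill was actually issued.
     */
    static boolean killIfStillRunning(JobHandle job) {
        if (job.isComplete()) {
            // Already finished (successfully or not): leave it alone so the
            // history server can pick it up.
            return false;
        }
        job.killJob();
        return true;
    }
}
```

With a guard like this, a job that has already failed is never killed again, 
which matches the behaviour the reporter saw after removing the call.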





[jira] [Commented] (MAPREDUCE-4441) Fix build issue caused by MR-3451

2012-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414016#comment-13414016
 ] 

Hadoop QA commented on MAPREDUCE-4441:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12536446/MR-3451-build-fix.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

-1 findbugs.  The patch appears to introduce 9 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2587//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2587//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2587//console

This message is automatically generated.

> Fix build issue caused by MR-3451
> -
>
> Key: MAPREDUCE-4441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Fix For: 2.0.1-alpha
>
> Attachments: MR-3451-build-fix.patch
>
>
> TestFSSchedulerApp is in the wrong package and missing some imports.





[jira] [Commented] (MAPREDUCE-4427) Enable the RM to work with AM's that are not managed by it

2012-07-13 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414014#comment-13414014
 ] 

Thomas Graves commented on MAPREDUCE-4427:
--

note - I agree this is good stuff! 

Sorry I meant add it to the client side of the app - which you've answered, 
thanks.

I was thinking of any multi-tenant cluster, where people could start abusing 
the option and running things on gateway machines and overloading those 
gateways or perhaps just causing traffic between outside machines that SEs 
don't want or expect. But thinking about this more, there are plenty of other 
ways to cause issues like that so I'm good with leaving this off. If a use case 
ever comes up we can revisit.

Another question or at least something to think about - it appears it still goes 
through all the queue checks when submitting the application. I'm wondering if 
some of those checks might not apply in this case - for instance max am 
resources (maxActiveApplications) doesn't really apply because external AM's 
aren't using queue capacity for the AM itself.  That might not be a big issue 
right now if people use this just for debug, but if this is used say by AM's to 
launch other AM's in arbitrary containers it might be more of an issue.  

What happens when you kill one of these applications?  The RM can't really 
force kill it - so does it just kill all containers it has requested and "block" 
the AM from communicating?

minor nitpicky comments to consider:
Can we just remove the commented out code in the container executor: +  
//lfs.mkdir(containerDir, null, false);  
Could you also clarify what the comment there means "+  // Without this app 
with no cache files cannot launch tasks"? Is it supposed to be "without this, 
app with"  And then is "this" passing true into lfs.mkdir as last parameter?

> Enable the RM to work with AM's that are not managed by it
> --
>
> Key: MAPREDUCE-4427
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4427
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Bikas Saha
>Assignee: Bikas Saha
>  Labels: mrv2
> Attachments: MAPREDUCE-4427-1.patch, MAPREDUCE-4427-2.patch
>
>
> Currently, the RM itself manages the AM by allocating a container for it and 
> negotiating the launch on the NodeManager and manages the AM lifecycle. 
> Thereafter, the AM negotiates resources with the RM and launches tasks to do 
> the real work.
> It would be a useful improvement to enhance this model by allowing the AM to 
> be launched independently by the client without requiring the RM. These AM's 
> would be launched on a gateway machine that can talk to the cluster. This 
> would open up new use cases such as the following
> 1) Easy debugging of AM, especially during initial development. Having the AM 
> launched on an arbitrary cluster node makes it hard to look at logs or 
> attach a debugger to the AM. If it can be launched locally then these tasks 
> would be easier.
> 2) Running AM's that need special privileges that may not be available on 
> machines managed by the NodeManager





[jira] [Commented] (MAPREDUCE-3451) Port Fair Scheduler to MR2

2012-07-13 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414015#comment-13414015
 ] 

Arun C Murthy commented on MAPREDUCE-3451:
--

Patrick - the last comment/advice was to suppress them (from Harsh). Why weren't 
they?

Now, all patch builds are failing complaining about the findbugs warnings... 

> Port Fair Scheduler to MR2
> --
>
> Key: MAPREDUCE-3451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3451
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv2, scheduler
>Reporter: Patrick Wendell
>Assignee: Patrick Wendell
> Fix For: 2.0.1-alpha
>
> Attachments: MAPREDUCE-3451.v1.patch.txt, 
> MAPREDUCE-3451.v2.patch.txt, MAPREDUCE-3451.v3.patch.txt, 
> MAPREDUCE-3451.v4.patch.txt, MAPREDUCE-3451.v5.patch, 
> MAPREDUCE-3451.v6.patch, MAPREDUCE-3451.v7.patch, MAPREDUCE-3451.v8.patch, 
> MAPREDUCE-3451.v9.patch
>
>
> The Fair Scheduler is in widespread use today in MR1 clusters, but not yet 
> ported to MR2. This is to track the porting of the Fair Scheduler to MR2 and 
> will be updated to include design considerations and progress.





[jira] [Created] (MAPREDUCE-4442) Accessing hadoop counters from a job is unreliable in yarn during in AM process cleanup window

2012-07-13 Thread Rahul Jain (JIRA)
Rahul Jain created MAPREDUCE-4442:
-

 Summary: Accessing hadoop counters from a job is unreliable in 
yarn during in AM process cleanup  window
 Key: MAPREDUCE-4442
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4442
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.0.0-alpha
Reporter: Rahul Jain


We found this issue during our tests moving from MapReduceV1 to MapReduceV2. A 
few of our applications access job counters multiple times:

a) After submission of the job, while the job is executing (works fine)

b) Right after job complete notification is received (works fine)

c) A few seconds after job complete notification (fails most of the time).

The error snippet is as follows:

{code}
2012-07-12 19:12:29,039 WARN  [Client] Unexpected error reading responses on connection Thread[IPC Client (1252749669) connection to sjc1-ciq-ibm-grid07.carrieriq.com/10.202.50.187:47944 from hadoop,5,main]
java.lang.NullPointerException
    at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:852)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:781)
2012-07-12 19:12:29,044 INFO  [ClientServiceDelegate] Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2012-07-12 19:12:29,132 INFO  [ClientServiceDelegate] Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2012-07-12 19:12:29,216 ERROR [UserGroupInformation] PriviledgedActionException as:hadoop (auth:SIMPLE) cause:java.io.IOException
2012-07-12 19:12:29,216 WARN  [BaseOutputStageJob] getJobCounters: Unable to retrieve counters. null
java.io.IOException
    at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:315)
    at org.apache.hadoop.mapred.ClientServiceDelegate.getJobCounters(ClientServiceDelegate.java:335)
    at org.apache.hadoop.mapred.YARNRunner.getJobCounters(YARNRunner.java:470)
    at org.apache.hadoop.mapreduce.Job$8.run(Job.java:719)
    at org.apache.hadoop.mapreduce.Job$8.run(Job.java:716)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
    at org.apache.hadoop.mapreduce.Job.getCounters(Job.java:716)
    at org.apache.hadoop.mapred.JobClient$NetworkedJob.getCounters(JobClient.java:396)
{code}

 
I'll attach the logs for AM and resource mgr separately, however no unusual 
activity is seen in those.

This makes me suspect that we have a race condition in the code trying to 
access job counters when AM is finishing up and the job hasn't moved to history 
server yet.
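
Until the suspected race is confirmed and fixed, a client could paper over the 
window by retrying the counter lookup a few times. The following is a minimal 
sketch under that assumption; CounterRetry and fetchWithRetry are hypothetical 
names, and the Callable stands in for whatever call the client uses to read 
counters (for example job.getCounters()). It is a workaround on the client 
side, not a proposed fix for the issue.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

final class CounterRetry {
    private CounterRetry() {}

    /**
     * Calls fetch up to maxAttempts times, sleeping delayMillis between
     * attempts, and retries only when an IOException is thrown.
     * The last IOException is rethrown if every attempt fails.
     */
    static <T> T fetchWithRetry(Callable<T> fetch, int maxAttempts, long delayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return fetch.call();
            } catch (IOException e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Give the AM time to hand the job over to the history server.
                    Thread.sleep(delayMillis);
                }
            }
        }
        throw last;
    }
}
```

The attempt count and delay are guesses that would need tuning against how 
long the AM cleanup window actually lasts on a given cluster.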





[jira] [Commented] (MAPREDUCE-4283) Display tail of aggregated logs by default

2012-07-13 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414011#comment-13414011
 ] 

Robert Joseph Evans commented on MAPREDUCE-4283:


Jason,

There are some issues with how you seek in the file.

InputStream.available is only supposed to return an estimate of the number of 
bytes that can be read without blocking.  It looks like you are using it to try 
and read to the end of the BoundedInputStream, but I am not sure that it is 
guaranteed to work that way.
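
One way to reach the tail of a stream without relying on available() is to 
skip explicitly, given a known total length. The sketch below is illustrative 
only and is not taken from the patch: TailReader and its parameters are 
hypothetical, and it assumes the caller knows the length of the log in 
advance. It loops on skip() because a single call may skip fewer bytes than 
requested.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class TailReader {
    private TailReader() {}

    /**
     * Returns the bytes remaining after skipping (totalLength - tailBytes)
     * bytes, i.e. the last tailBytes bytes when totalLength is accurate.
     */
    static byte[] tail(InputStream in, long totalLength, int tailBytes) throws IOException {
        long toSkip = Math.max(0L, totalLength - tailBytes);
        while (toSkip > 0) {
            long skipped = in.skip(toSkip);
            if (skipped > 0) {
                toSkip -= skipped;
            } else if (in.read() >= 0) {
                // skip() made no progress, so consume a single byte instead.
                toSkip--;
            } else {
                break; // stream ended before the expected offset
            }
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }
}
```

If the stream is shorter than totalLength the method simply returns whatever 
is left, which may be an empty array.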

> Display tail of aggregated logs by default
> --
>
> Key: MAPREDUCE-4283
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4283
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver, mrv2
>Affects Versions: 0.23.3, 2.0.1-alpha
>Reporter: Jason Lowe
>Assignee: Jason Lowe
> Attachments: MAPREDUCE-4283.patch
>
>
> Similar to the manner in which the nodemanager webUI displays container logs, 
> it would be very useful if the historyserver showed the trailing 4K or so of 
> the aggregated logs with a link to see the full log.
> When debugging issues the relevant errors are usually at the end of the log, 
> so showing just the last few K can enable quick diagnosis without waiting for 
> what can be many megabytes of log data to download. 





[jira] [Updated] (MAPREDUCE-4441) Fix build issue caused by MR-3451

2012-07-13 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4441:


Fix Version/s: 2.0.1-alpha
Affects Version/s: 2.0.0-alpha
   Status: Patch Available  (was: Open)

> Fix build issue caused by MR-3451
> -
>
> Key: MAPREDUCE-4441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Fix For: 2.0.1-alpha
>
> Attachments: MR-3451-build-fix.patch
>
>
> TestFSSchedulerApp is in the wrong package and missing some imports.





[jira] [Assigned] (MAPREDUCE-4441) Fix build issue caused by MR-3451

2012-07-13 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla reassigned MAPREDUCE-4441:
---

Assignee: Karthik Kambatla

> Fix build issue caused by MR-3451
> -
>
> Key: MAPREDUCE-4441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: MR-3451-build-fix.patch
>
>
> TestFSSchedulerApp is in the wrong package and missing some imports.





[jira] [Updated] (MAPREDUCE-4441) Fix build issue caused by MR-3451

2012-07-13 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4441:


Attachment: MR-3451-build-fix.patch

Uploading a patch to fix FS build issue introduced by MR-3451.

> Fix build issue caused by MR-3451
> -
>
> Key: MAPREDUCE-4441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Karthik Kambatla
> Attachments: MR-3451-build-fix.patch
>
>
> TestFSSchedulerApp is in the wrong package and missing some imports.





[jira] [Created] (MAPREDUCE-4441) Fix build issue caused by MR-3451

2012-07-13 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created MAPREDUCE-4441:
---

 Summary: Fix build issue caused by MR-3451
 Key: MAPREDUCE-4441
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4441
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Karthik Kambatla
 Attachments: MR-3451-build-fix.patch

TestFSSchedulerApp is in the wrong package and missing some imports.





[jira] [Commented] (MAPREDUCE-3451) Port Fair Scheduler to MR2

2012-07-13 Thread Patrick Wendell (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413998#comment-13413998
 ] 

Patrick Wendell commented on MAPREDUCE-3451:


It's a one line change to the package header. The findbugs issues are discussed 
further up in this Jira and were also discussed during the last round of 
reviews.

> Port Fair Scheduler to MR2
> --
>
> Key: MAPREDUCE-3451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3451
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv2, scheduler
>Reporter: Patrick Wendell
>Assignee: Patrick Wendell
> Fix For: 2.0.1-alpha
>
> Attachments: MAPREDUCE-3451.v1.patch.txt, 
> MAPREDUCE-3451.v2.patch.txt, MAPREDUCE-3451.v3.patch.txt, 
> MAPREDUCE-3451.v4.patch.txt, MAPREDUCE-3451.v5.patch, 
> MAPREDUCE-3451.v6.patch, MAPREDUCE-3451.v7.patch, MAPREDUCE-3451.v8.patch, 
> MAPREDUCE-3451.v9.patch
>
>
> The Fair Scheduler is in widespread use today in MR1 clusters, but not yet 
> ported to MR2. This is to track the porting of the Fair Scheduler to MR2 and 
> will be updated to include design considerations and progress.





[jira] [Commented] (MAPREDUCE-3451) Port Fair Scheduler to MR2

2012-07-13 Thread Patrick Wendell (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413997#comment-13413997
 ] 

Patrick Wendell commented on MAPREDUCE-3451:


Yes the error is related to the last minute movement of that test into the Fair 
package.

Karthik - could you quickly patch this so we can get the build stabilized? If 
you can fix the findbugs quickly that would be great too - as I said earlier 
these are false positives but we might be able to coerce findbugs into not 
spouting warnings.

> Port Fair Scheduler to MR2
> --
>
> Key: MAPREDUCE-3451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3451
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv2, scheduler
>Reporter: Patrick Wendell
>Assignee: Patrick Wendell
> Fix For: 2.0.1-alpha
>
> Attachments: MAPREDUCE-3451.v1.patch.txt, 
> MAPREDUCE-3451.v2.patch.txt, MAPREDUCE-3451.v3.patch.txt, 
> MAPREDUCE-3451.v4.patch.txt, MAPREDUCE-3451.v5.patch, 
> MAPREDUCE-3451.v6.patch, MAPREDUCE-3451.v7.patch, MAPREDUCE-3451.v8.patch, 
> MAPREDUCE-3451.v9.patch
>
>
> The Fair Scheduler is in widespread use today in MR1 clusters, but not yet 
> ported to MR2. This is to track the porting of the Fair Scheduler to MR2 and 
> will be updated to include design considerations and progress.





[jira] [Commented] (MAPREDUCE-4157) ResourceManager should not kill apps that are well behaved

2012-07-13 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413994#comment-13413994
 ] 

Robert Joseph Evans commented on MAPREDUCE-4157:


I looked through the new patch and I am still a +1 on this change.

> ResourceManager should not kill apps that are well behaved
> --
>
> Key: MAPREDUCE-4157
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4157
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2
>Affects Versions: 2.0.0-alpha
>Reporter: Jason Lowe
>Assignee: Jason Lowe
> Attachments: MAPREDUCE-4157.patch, MAPREDUCE-4157.patch
>
>
> Currently when the ApplicationMaster unregisters with the ResourceManager, 
> the RM kills (via the NMs) all the active containers for an application.  
> This introduces a race where the AM may be trying to clean up and may not 
> finish before it is killed.  The RM should give the AM a chance to exit 
> cleanly on its own rather than always race with a pending kill on shutdown.





[jira] [Commented] (MAPREDUCE-4395) Possible NPE at ClientDistributedCacheManager#determineTimestamps

2012-07-13 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413979#comment-13413979
 ] 

Robert Joseph Evans commented on MAPREDUCE-4395:


Looking at the patch it looks OK to me.  I looked and the only other thing that 
uses this API is streaming, when it is setting up a JobConf, so I am OK with it 
blowing up when the URI is not valid.  My only comment is that I would like the 
Javadocs for this method updated to explain what it does and also indicate what 
happens in the failure case, because, well,

{code}
/**
 *
 * @param str
 */
{code}

is completely useless.  

> Possible NPE at ClientDistributedCacheManager#determineTimestamps
> -
>
> Key: MAPREDUCE-4395
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4395
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distributed-cache, job submission, mrv2
>Affects Versions: 2.0.0-alpha, trunk
>Reporter: Bhallamudi Venkata Siva Kamesh
>Assignee: Bhallamudi Venkata Siva Kamesh
>Priority: Critical
> Attachments: MAPREDUCE-4395.patch
>
>
> {code:title=ClientDistributedCacheManager#determineTimestamps|borderStyle=solid}
> URI[] tfiles = DistributedCache.getCacheFiles(job);
> {code}
> It may be possible that the tfiles array contains *null* as one of its 
> entries, which subsequently leads to an NPE.
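
A fail-fast check for null entries, in the spirit of the review comment about 
"blowing up" on invalid input, might look like the sketch below. It is not the 
attached patch: CacheUriCheck and requireNoNulls are hypothetical names, and 
the choice of IllegalArgumentException is an assumption made for illustration.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

final class CacheUriCheck {
    private CacheUriCheck() {}

    /**
     * Copies the given URIs into a list, rejecting null entries with a
     * descriptive exception instead of a later NullPointerException.
     * A null array is treated as "no cache files".
     */
    static List<URI> requireNoNulls(URI[] uris) {
        List<URI> result = new ArrayList<>();
        if (uris == null) {
            return result;
        }
        for (int i = 0; i < uris.length; i++) {
            if (uris[i] == null) {
                throw new IllegalArgumentException(
                        "cache file URI at index " + i + " is null");
            }
            result.add(uris[i]);
        }
        return result;
    }
}
```

Reporting the offending index makes the failure easier to trace back to the 
job configuration than a bare NPE deep inside timestamp handling.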





[jira] [Commented] (MAPREDUCE-4422) YARN_APPLICATION_CLASSPATH needs a documented default value in YarnConfiguration

2012-07-13 Thread Ahmed Radwan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413970#comment-13413970
 ] 

Ahmed Radwan commented on MAPREDUCE-4422:
-

Thanks Arun for the clarifications! I have already submitted a new patch 
yesterday incorporating your comments. Please let me know if you have any other 
comments. 

> YARN_APPLICATION_CLASSPATH needs a documented default value in 
> YarnConfiguration
> 
>
> Key: MAPREDUCE-4422
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4422
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.0.0-alpha
>Reporter: Arun C Murthy
>Assignee: Ahmed Radwan
> Fix For: 2.0.1-alpha
>
> Attachments: MAPREDUCE-4422.patch, MAPREDUCE-4422_rev2.patch, 
> MAPREDUCE-4422_rev3.patch, MAPREDUCE-4422_rev3.patch
>
>
> MAPREDUCE-3505 allowed YARN_APPLICATION_CLASSPATH to be configurable.
> However, we didn't add a default value to YarnConfiguration, as is the norm.
> Ran into it while investigating MAPREDUCE-4421.





[jira] [Commented] (MAPREDUCE-4415) Backport the Job.getInstance methods from MAPREDUCE-1505 to branch-1

2012-07-13 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413962#comment-13413962
 ] 

Harsh J commented on MAPREDUCE-4415:


Arun - Ping?

> Backport the Job.getInstance methods from MAPREDUCE-1505 to branch-1
> 
>
> Key: MAPREDUCE-4415
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4415
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv1
>Affects Versions: 1.0.0
>Reporter: Harsh J
>Assignee: Harsh J
> Attachments: MAPREDUCE-4415.patch
>
>
> In 2.x MR, the Job constructors have all been deprecated in favor of 
> Job.getInstance() calls to get a Job object.
> However, these getInstance methods do not appear to be present in the 1.x MR 
> API, and thereby may cause additional pain to users moving from 1.x to 2.x 
> going forward.
> This patch proposes to add in the getInstance style of methods with suitable 
> test coverage for both style of constructors, while not pulling in anything 
> else from MAPREDUCE-1505 (as we lack 'Cluster' in 1.x). As we're not going to 
> be deprecating the regular ctors in a 1.x release, this is not an 
> incompatible change in any way.





[jira] [Updated] (MAPREDUCE-4405) Adding test case for HierarchicalQueue in TestJobQueueClient

2012-07-13 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated MAPREDUCE-4405:
---

Fix Version/s: 0.22.1
 Hadoop Flags: Reviewed

I just committed this to branch 0.22.1. Thank you Mayank.

> Adding test case for HierarchicalQueue in TestJobQueueClient
> 
>
> Key: MAPREDUCE-4405
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4405
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.22.0
>Reporter: Mayank Bansal
>Assignee: Mayank Bansal
>Priority: Minor
> Fix For: 0.22.1
>
> Attachments: MAPREDUCE-4405-22-v2.patch, MAPREDUCE-4405-22.patch
>
>
> Adding test case for HierarchicalQueue in TestJobQueueClient





[jira] [Commented] (MAPREDUCE-4405) Adding test case for HierarchicalQueue in TestJobQueueClient

2012-07-13 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413948#comment-13413948
 ] 

Konstantin Shvachko commented on MAPREDUCE-4405:


+1 looks good.

> Adding test case for HierarchicalQueue in TestJobQueueClient
> 
>
> Key: MAPREDUCE-4405
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4405
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.22.0
>Reporter: Mayank Bansal
>Assignee: Mayank Bansal
>Priority: Minor
> Attachments: MAPREDUCE-4405-22-v2.patch, MAPREDUCE-4405-22.patch
>
>
> Adding test case for HierarchicalQueue in TestJobQueueClient





[jira] [Updated] (MAPREDUCE-4405) Adding test case for HierarchicalQueue in TestJobQueueClient

2012-07-13 Thread Mayank Bansal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-4405:
-

Attachment: MAPREDUCE-4405-22-v2.patch

Incorporating Konstantin's comment

Thanks,
Mayank

> Adding test case for HierarchicalQueue in TestJobQueueClient
> 
>
> Key: MAPREDUCE-4405
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4405
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.22.0
>Reporter: Mayank Bansal
>Assignee: Mayank Bansal
>Priority: Minor
> Attachments: MAPREDUCE-4405-22-v2.patch, MAPREDUCE-4405-22.patch
>
>
> Adding test case for HierarchicalQueue in TestJobQueueClient





[jira] [Updated] (MAPREDUCE-4405) Adding test case for HierarchicalQueue in TestJobQueueClient

2012-07-13 Thread Mayank Bansal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-4405:
-

Issue Type: Improvement  (was: Bug)

> Adding test case for HierarchicalQueue in TestJobQueueClient
> 
>
> Key: MAPREDUCE-4405
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4405
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.22.0
>Reporter: Mayank Bansal
>Assignee: Mayank Bansal
>Priority: Minor
> Attachments: MAPREDUCE-4405-22-v2.patch, MAPREDUCE-4405-22.patch
>
>
> Adding test case for HierarchicalQueue in TestJobQueueClient




