[jira] [Commented] (YARN-178) Fix custom ProcessTree instance creation

2012-10-23 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482247#comment-13482247
 ] 

Bikas Saha commented on YARN-178:
-

Then why add it to the constructor of the abstract base class?

 Fix custom ProcessTree instance creation
 

 Key: YARN-178
 URL: https://issues.apache.org/jira/browse/YARN-178
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.5
Reporter: Radim Kolar
Assignee: Radim Kolar
Priority: Critical
 Attachments: pstree-instance2.txt, pstree-instance.txt


 1. The current pluggable resourcecalculatorprocesstree is not passed the root 
 process id, making custom implementations unusable.
 2. pstree does not extend Configured as it should.
 Added a constructor with a pid argument, plus a test suite. Also added a test that 
 pstree is correctly configured.
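For readers following along, here is a minimal sketch of the shape of fix the description is asking for. The names below (PluggableProcessTree, DummyProcessTree, ProcessTreeFactory) are simplified stand-ins, not the real ResourceCalculatorProcessTree API or the attached patch: the framework hands the root process id to the plugin's constructor and pushes its configuration in via Configured.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;

// Hypothetical pluggable base: the root process id arrives via the constructor,
// the NM configuration via Configured/Configurable.
abstract class PluggableProcessTree extends Configured {
  protected final String rootPid;

  protected PluggableProcessTree(String rootPid) {
    this.rootPid = rootPid;
  }

  /** Total RSS (bytes) of the tree rooted at rootPid. */
  abstract long getCumulativeRssmem();
}

// A custom implementation is now usable because it knows which process tree
// it is supposed to monitor.
class DummyProcessTree extends PluggableProcessTree {
  DummyProcessTree(String rootPid) {
    super(rootPid);
  }

  @Override
  long getCumulativeRssmem() {
    return 0L; // a real plugin would walk /proc starting from rootPid
  }
}

class ProcessTreeFactory {
  // What the monitor would do when instantiating the configured plugin class:
  // invoke the (String pid) constructor reflectively, then call setConf().
  static PluggableProcessTree newInstance(Class<? extends PluggableProcessTree> clazz,
                                          String pid, Configuration conf) throws Exception {
    PluggableProcessTree tree = clazz.getDeclaredConstructor(String.class).newInstance(pid);
    tree.setConf(conf); // inherited from Configured, so the plugin sees the NM config
    return tree;
  }
}
{code}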

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-174) TestNodeStatusUpdater is failing in trunk

2012-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482251#comment-13482251
 ] 

Hudson commented on YARN-174:
-

Integrated in Hadoop-Yarn-trunk #12 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/12/])
YARN-174. Modify NodeManager to pass the user's configuration even when 
rebooting. Contributed by Vinod Kumar Vavilapalli. (Revision 1401086)

 Result = FAILURE
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1401086
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java
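The gist of the change the commit message describes, as a hedged sketch (field and method names here are illustrative, not the actual NodeManager.java diff): remember the Configuration the NM was started with and reuse it on reboot, instead of rebuilding a default one whose ${yarn.log.dir}-style values may not resolve inside a test JVM.

{code}
import org.apache.hadoop.conf.Configuration;

// Illustrative sketch only; not the real NodeManager code.
class RebootableNodeManager {
  private Configuration conf;   // remembered so a reboot sees the same settings

  void initAndStartNodeManager(Configuration conf) {
    this.conf = conf;
    // ... init() and start() the NM services with this conf ...
  }

  void reboot() {
    // Before: something like initAndStartNodeManager(new YarnConfiguration()),
    // which drops test-injected settings. After: reuse the original conf.
    initAndStartNodeManager(this.conf);
  }
}
{code}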


 TestNodeStatusUpdater is failing in trunk
 -

 Key: YARN-174
 URL: https://issues.apache.org/jira/browse/YARN-174
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.0.0
Reporter: Robert Joseph Evans
Assignee: Vinod Kumar Vavilapalli
 Fix For: 2.0.2-alpha, 0.23.5

 Attachments: YARN-174-20121022.txt, YARN-174.patch


 {noformat}
 2012-10-19 12:18:23,941 FATAL [Node Status Updater] nodemanager.NodeManager 
 (NodeManager.java:initAndStartNodeManager(277)) - Error starting NodeManager
 org.apache.hadoop.yarn.YarnException: ${yarn.log.dir}/userlogs is not a valid 
 path. Path should be with file scheme or without scheme
 at 
 org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.validatePaths(LocalDirsHandlerService.java:321)
 at 
 org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService$MonitoringTimerTask.init(LocalDirsHandlerService.java:95)
 at 
 org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.init(LocalDirsHandlerService.java:123)
 at 
 org.apache.hadoop.yarn.service.CompositeService.init(CompositeService.java:58)
 at 
 org.apache.hadoop.yarn.server.nodemanager.NodeHealthCheckerService.init(NodeHealthCheckerService.java:48)
 at 
 org.apache.hadoop.yarn.service.CompositeService.init(CompositeService.java:58)
 at 
 org.apache.hadoop.yarn.server.nodemanager.NodeManager.init(NodeManager.java:165)
 at 
 org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:274)
 at 
 org.apache.hadoop.yarn.server.nodemanager.NodeManager.stateChanged(NodeManager.java:256)
 at 
 org.apache.hadoop.yarn.service.AbstractService.changeState(AbstractService.java:163)
 at 
 org.apache.hadoop.yarn.service.AbstractService.stop(AbstractService.java:112)
 at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.stop(NodeStatusUpdaterImpl.java:149)
 at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.reboot(NodeStatusUpdaterImpl.java:157)
 at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.access$900(NodeStatusUpdaterImpl.java:63)
 at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl$1.run(NodeStatusUpdaterImpl.java:357)
 {noformat}
 The NM then calls System.exit(-1), which makes the unit test exit and 
 produces an error that is hard to track down.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-174) TestNodeStatusUpdater is failing in trunk

2012-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482336#comment-13482336
 ] 

Hudson commented on YARN-174:
-

Integrated in Hadoop-Mapreduce-trunk #1234 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1234/])
YARN-174. Modify NodeManager to pass the user's configuration even when 
rebooting. Contributed by Vinod Kumar Vavilapalli. (Revision 1401086)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1401086
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java


 TestNodeStatusUpdater is failing in trunk
 -

 Key: YARN-174
 URL: https://issues.apache.org/jira/browse/YARN-174
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.0.0
Reporter: Robert Joseph Evans
Assignee: Vinod Kumar Vavilapalli
 Fix For: 2.0.2-alpha, 0.23.5

 Attachments: YARN-174-20121022.txt, YARN-174.patch


 {noformat}
 2012-10-19 12:18:23,941 FATAL [Node Status Updater] nodemanager.NodeManager 
 (NodeManager.java:initAndStartNodeManager(277)) - Error starting NodeManager
 org.apache.hadoop.yarn.YarnException: ${yarn.log.dir}/userlogs is not a valid 
 path. Path should be with file scheme or without scheme
 at 
 org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.validatePaths(LocalDirsHandlerService.java:321)
 at 
 org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService$MonitoringTimerTask.init(LocalDirsHandlerService.java:95)
 at 
 org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.init(LocalDirsHandlerService.java:123)
 at 
 org.apache.hadoop.yarn.service.CompositeService.init(CompositeService.java:58)
 at 
 org.apache.hadoop.yarn.server.nodemanager.NodeHealthCheckerService.init(NodeHealthCheckerService.java:48)
 at 
 org.apache.hadoop.yarn.service.CompositeService.init(CompositeService.java:58)
 at 
 org.apache.hadoop.yarn.server.nodemanager.NodeManager.init(NodeManager.java:165)
 at 
 org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:274)
 at 
 org.apache.hadoop.yarn.server.nodemanager.NodeManager.stateChanged(NodeManager.java:256)
 at 
 org.apache.hadoop.yarn.service.AbstractService.changeState(AbstractService.java:163)
 at 
 org.apache.hadoop.yarn.service.AbstractService.stop(AbstractService.java:112)
 at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.stop(NodeStatusUpdaterImpl.java:149)
 at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.reboot(NodeStatusUpdaterImpl.java:157)
 at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.access$900(NodeStatusUpdaterImpl.java:63)
 at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl$1.run(NodeStatusUpdaterImpl.java:357)
 {noformat}
 The NM then calls System.exit(-1), which makes the unit test exit and 
 produces an error that is hard to track down.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-177) CapacityScheduler - adding a queue while the RM is running has wacky results

2012-10-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482353#comment-13482353
 ] 

Hadoop QA commented on YARN-177:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12550459/YARN-177.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/117//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/117//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/117//console

This message is automatically generated.

 CapacityScheduler - adding a queue while the RM is running has wacky results
 

 Key: YARN-177
 URL: https://issues.apache.org/jira/browse/YARN-177
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 0.23.3
Reporter: Thomas Graves
Assignee: Arun C Murthy
Priority: Critical
 Fix For: 2.0.3-alpha, 0.23.5

 Attachments: YARN-177.patch, YARN-177.patch


 Adding a queue to the capacity scheduler while the RM is running, and then 
 running a job in the newly added queue, results in very strange behavior. The 
 cluster Total Memory can either decrease or increase. We had a cluster where 
 total memory decreased to almost 1/6th of its capacity. On a small test 
 cluster, simply adding a queue and running wordcount made the capacity go up. 
 Looking at the RM logs, used memory can go negative while other logs show the 
 number as positive:
 2012-10-21 22:56:44,796 [ResourceManager Event Processor] INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
 assignedContainer queue=root usedCapacity=0.0375 absoluteUsedCapacity=0.0375 
 used=memory: 7680 cluster=memory: 204800
 2012-10-21 22:56:45,831 [ResourceManager Event Processor] INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
 completedContainer queue=root usedCapacity=-0.0225 
 absoluteUsedCapacity=-0.0225 used=memory: -4608 cluster=memory: 204800
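 As a quick sanity check on those two log lines: the usedCapacity figures match a simple used-memory / cluster-memory ratio (an assumption about how the root queue's usedCapacity is derived, not taken from the patch), which suggests the anomaly is the negative used memory itself rather than the ratio computation.

{code}
// Arithmetic check of the log excerpt above; the used/cluster formula is an
// assumption about how root usedCapacity is computed.
public class UsedCapacityCheck {
  public static void main(String[] args) {
    int cluster = 204800;                   // cluster=memory: 204800
    System.out.println(7680.0 / cluster);   // 0.0375  (assignedContainer line)
    System.out.println(-4608.0 / cluster);  // -0.0225 (completedContainer line)
  }
}
{code}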
   

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-177) CapacityScheduler - adding a queue while the RM is running has wacky results

2012-10-23 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482397#comment-13482397
 ] 

Thomas Graves commented on YARN-177:


The LeafQueue has a setParentQueue that is unused and can be removed now.

 CapacityScheduler - adding a queue while the RM is running has wacky results
 

 Key: YARN-177
 URL: https://issues.apache.org/jira/browse/YARN-177
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 0.23.3
Reporter: Thomas Graves
Assignee: Arun C Murthy
Priority: Critical
 Fix For: 2.0.3-alpha, 0.23.5

 Attachments: YARN-177.patch, YARN-177.patch


 Adding a queue to the capacity scheduler while the RM is running, and then 
 running a job in the newly added queue, results in very strange behavior. The 
 cluster Total Memory can either decrease or increase. We had a cluster where 
 total memory decreased to almost 1/6th of its capacity. On a small test 
 cluster, simply adding a queue and running wordcount made the capacity go up. 
 Looking at the RM logs, used memory can go negative while other logs show the 
 number as positive:
 2012-10-21 22:56:44,796 [ResourceManager Event Processor] INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
 assignedContainer queue=root usedCapacity=0.0375 absoluteUsedCapacity=0.0375 
 used=memory: 7680 cluster=memory: 204800
 2012-10-21 22:56:45,831 [ResourceManager Event Processor] INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
 completedContainer queue=root usedCapacity=-0.0225 
 absoluteUsedCapacity=-0.0225 used=memory: -4608 cluster=memory: 204800
   

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-178) Fix custom ProcessTree instance creation

2012-10-23 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482491#comment-13482491
 ] 

Robert Joseph Evans commented on YARN-178:
--

Makes sense. Bikas, unless you have a strong objection, I will check this in 
this afternoon.

 Fix custom ProcessTree instance creation
 

 Key: YARN-178
 URL: https://issues.apache.org/jira/browse/YARN-178
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.5
Reporter: Radim Kolar
Assignee: Radim Kolar
Priority: Critical
 Attachments: pstree-instance2.txt, pstree-instance.txt


 1. The current pluggable resourcecalculatorprocesstree is not passed the root 
 process id, making custom implementations unusable.
 2. pstree does not extend Configured as it should.
 Added a constructor with a pid argument, plus a test suite. Also added a test that 
 pstree is correctly configured.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-177) CapacityScheduler - adding a queue while the RM is running has wacky results

2012-10-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482518#comment-13482518
 ] 

Hadoop QA commented on YARN-177:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12550499/YARN-177.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/118//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/118//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/118//console

This message is automatically generated.

 CapacityScheduler - adding a queue while the RM is running has wacky results
 

 Key: YARN-177
 URL: https://issues.apache.org/jira/browse/YARN-177
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 0.23.3
Reporter: Thomas Graves
Assignee: Arun C Murthy
Priority: Critical
 Fix For: 2.0.3-alpha, 0.23.5

 Attachments: YARN-177.patch, YARN-177.patch, YARN-177.patch


 Adding a queue to the capacity scheduler while the RM is running, and then 
 running a job in the newly added queue, results in very strange behavior. The 
 cluster Total Memory can either decrease or increase. We had a cluster where 
 total memory decreased to almost 1/6th of its capacity. On a small test 
 cluster, simply adding a queue and running wordcount made the capacity go up. 
 Looking at the RM logs, used memory can go negative while other logs show the 
 number as positive:
 2012-10-21 22:56:44,796 [ResourceManager Event Processor] INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
 assignedContainer queue=root usedCapacity=0.0375 absoluteUsedCapacity=0.0375 
 used=memory: 7680 cluster=memory: 204800
 2012-10-21 22:56:45,831 [ResourceManager Event Processor] INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
 completedContainer queue=root usedCapacity=-0.0225 
 absoluteUsedCapacity=-0.0225 used=memory: -4608 cluster=memory: 204800
   

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-177) CapacityScheduler - adding a queue while the RM is running has wacky results

2012-10-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482563#comment-13482563
 ] 

Hadoop QA commented on YARN-177:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12550506/YARN-177.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/119//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/119//console

This message is automatically generated.

 CapacityScheduler - adding a queue while the RM is running has wacky results
 

 Key: YARN-177
 URL: https://issues.apache.org/jira/browse/YARN-177
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 0.23.3
Reporter: Thomas Graves
Assignee: Arun C Murthy
Priority: Critical
 Fix For: 2.0.3-alpha, 0.23.5

 Attachments: YARN-177.patch, YARN-177.patch, YARN-177.patch, 
 YARN-177.patch


 Adding a queue to the capacity scheduler while the RM is running, and then 
 running a job in the newly added queue, results in very strange behavior. The 
 cluster Total Memory can either decrease or increase. We had a cluster where 
 total memory decreased to almost 1/6th of its capacity. On a small test 
 cluster, simply adding a queue and running wordcount made the capacity go up. 
 Looking at the RM logs, used memory can go negative while other logs show the 
 number as positive:
 2012-10-21 22:56:44,796 [ResourceManager Event Processor] INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
 assignedContainer queue=root usedCapacity=0.0375 absoluteUsedCapacity=0.0375 
 used=memory: 7680 cluster=memory: 204800
 2012-10-21 22:56:45,831 [ResourceManager Event Processor] INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
 completedContainer queue=root usedCapacity=-0.0225 
 absoluteUsedCapacity=-0.0225 used=memory: -4608 cluster=memory: 204800
   

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-180) Capacity scheduler - containers that get reserved create container token too early

2012-10-23 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482589#comment-13482589
 ] 

Robert Joseph Evans commented on YARN-180:
--

The patch looks mostly good. I am a bit confused by {code}if (containerToken == 
null) {
  containerToken = null; // Try again later.
}
{code} inside the new createContainerToken method. It is a copy-and-paste from 
before, but is no longer needed.

Other than that it looks good. Since Arun is on a plane now, I will upload a new 
patch.
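To make the point about the quoted snippet concrete: re-assigning null to a variable that is already null is a no-op, so the branch can simply be dropped. Below is a minimal sketch of the surrounding shape; the type and method names are assumptions rather than the actual patch, and the null return just keeps the "try again later" convention from the quoted comment.

{code}
// Illustrative sketch only; not the YARN-180 patch.
final class ContainerTokenExample {
  /** Stand-in for the RM's token secret manager; null when security is off. */
  interface TokenBuilder { String build(String containerId, String nodeId); }

  private final TokenBuilder containerTokenSecretManager; // may be null

  ContainerTokenExample(TokenBuilder mgr) { this.containerTokenSecretManager = mgr; }

  String createContainerToken(String containerId, String nodeId) {
    if (containerTokenSecretManager == null) {
      // Caller interprets null as "try again later"; there is no need for the
      // quoted `containerToken = null` branch, which assigns null to null.
      return null;
    }
    return containerTokenSecretManager.build(containerId, nodeId);
  }
}
{code}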

 Capacity scheduler - containers that get reserved create container token too 
 early
 -

 Key: YARN-180
 URL: https://issues.apache.org/jira/browse/YARN-180
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 0.23.3
Reporter: Thomas Graves
Assignee: Arun C Murthy
Priority: Critical
 Fix For: 2.0.3-alpha, 0.23.5

 Attachments: YARN-180.patch, YARN-180.patch


 The capacity scheduler has the ability to 'reserve' containers. 
 Unfortunately, before it decides that a container goes to reserved rather than 
 assigned, the Container object is created, which creates a container token 
 that expires in roughly 10 minutes by default. 
 This means that by the time the NM frees up enough space on that node for the 
 container to move to assigned, the container token may have expired.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-180) Capacity scheduler - containers that get reserved create container token too early

2012-10-23 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482591#comment-13482591
 ] 

Robert Joseph Evans commented on YARN-180:
--

Oh, I noticed that the containerToken is never assigned anyway. I will fix 
that too.

 Capacity scheduler - containers that get reserved create container token too 
 early
 -

 Key: YARN-180
 URL: https://issues.apache.org/jira/browse/YARN-180
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 0.23.3
Reporter: Thomas Graves
Assignee: Arun C Murthy
Priority: Critical
 Fix For: 2.0.3-alpha, 0.23.5

 Attachments: YARN-180.patch, YARN-180.patch


 The capacity scheduler has the ability to 'reserve' containers. 
 Unfortunately, before it decides that a container goes to reserved rather than 
 assigned, the Container object is created, which creates a container token 
 that expires in roughly 10 minutes by default. 
 This means that by the time the NM frees up enough space on that node for the 
 container to move to assigned, the container token may have expired.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-180) Capacity scheduler - containers that get reserved create container token too early

2012-10-23 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated YARN-180:
-

Attachment: YARN-180.patch

 Capacity scheduler - containers that get reserved create container token too 
 early
 -

 Key: YARN-180
 URL: https://issues.apache.org/jira/browse/YARN-180
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 0.23.3
Reporter: Thomas Graves
Assignee: Arun C Murthy
Priority: Critical
 Fix For: 2.0.3-alpha, 0.23.5

 Attachments: YARN-180.patch, YARN-180.patch, YARN-180.patch


 The capacity scheduler has the ability to 'reserve' containers. 
 Unfortunately, before it decides that a container goes to reserved rather than 
 assigned, the Container object is created, which creates a container token 
 that expires in roughly 10 minutes by default. 
 This means that by the time the NM frees up enough space on that node for the 
 container to move to assigned, the container token may have expired.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-180) Capacity scheduler - containers that get reserved create container token too early

2012-10-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482611#comment-13482611
 ] 

Hadoop QA commented on YARN-180:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12550518/YARN-180.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/121//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/121//console

This message is automatically generated.

 Capacity scheduler - containers that get reserved create container token too 
 early
 -

 Key: YARN-180
 URL: https://issues.apache.org/jira/browse/YARN-180
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 0.23.3
Reporter: Thomas Graves
Assignee: Arun C Murthy
Priority: Critical
 Fix For: 2.0.3-alpha, 0.23.5

 Attachments: YARN-180.patch, YARN-180.patch, YARN-180.patch


 The capacity scheduler has the ability to 'reserve' containers. 
 Unfortunately, before it decides that a container goes to reserved rather than 
 assigned, the Container object is created, which creates a container token 
 that expires in roughly 10 minutes by default. 
 This means that by the time the NM frees up enough space on that node for the 
 container to move to assigned, the container token may have expired.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-183) Clean up fair scheduler code

2012-10-23 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-183:
---

 Summary: Clean up fair scheduler code
 Key: YARN-183
 URL: https://issues.apache.org/jira/browse/YARN-183
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Priority: Minor


The fair scheduler code has a bunch of minor stylistic issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-167) AM stuck in KILL_WAIT for days

2012-10-23 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482716#comment-13482716
 ] 

Robert Joseph Evans commented on YARN-167:
--

I am rather nervous about backporting MAPREDUCE-3353. It is a major feature 
that has a significant footprint and was not all that stable when it first went 
in.  I know that it has since stabilized but I am still nervous about such a 
large change. It seems like it would be simpler to handle the KILL events in 
the states that missed it.

 AM stuck in KILL_WAIT for days
 --

 Key: YARN-167
 URL: https://issues.apache.org/jira/browse/YARN-167
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 0.23.3
Reporter: Ravi Prakash
Assignee: Vinod Kumar Vavilapalli
 Attachments: TaskAttemptStateGraph.jpg


 We found some jobs were stuck in KILL_WAIT for days on end. The RM shows them 
 as RUNNING. When you go to the AM, it shows it in the KILL_WAIT state, and a 
 few maps running. All these maps were scheduled on nodes which are now in the 
 RM's Lost nodes list. The running maps are in the FAIL_CONTAINER_CLEANUP state.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-147) Add support for CPU isolation/monitoring of containers

2012-10-23 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482751#comment-13482751
 ] 

Vinod Kumar Vavilapalli commented on YARN-147:
--

Got lost between YARN-3 and YARN-147 :)

This is a very nice patch to have; I am willing to get it in ASAP given the amount of 
time it's been around.

Some comments below.
 - yarn.nodemanager.linux-container-executor.cgroups.mount has different 
defaults in code and in yarn-default.xml.
 - If the configs can be done away with (see below), ignore this comment. The 
descriptions for all the new configs in yarn-default.xml heavily reference 
code. We should simplify them so they do not refer to code, make them 
understandable by users, and cross-reference other related parameters.
 - {code}// Based on testing, ApplicationMaster executables don't terminate 
until
// a little after the container appears to have finished. Therefore, we
// wait a short bit for the cgroup to become empty before deleting it.
   {code}
   Can you explain this? Is this sleep necessary? Depending on its importance, 
we'll need to fix the following ID check; AMs don't always have an ID equal to one.
 - container-executor.c: If a mount point is already mounted, mount gives an 
EBUSY error, so mount_cgroup() will need to be fixed to support remounts (e.g. 
on NM restarts). We could unmount the cgroup fs on shutdown, but that isn't always 
guaranteed.
 - Please update if you have tested it on a secure setup with LCE enabled, both 
with and without cgroups.

The following have already been raised by others in some way, but I don't see them 
fixed in the latest patch, unless I am missing something:
 - Not sure of the benefit of a configurable 
yarn.nodemanager.linux-container-executor.cgroups.mount-path. Couldn't the NM just 
always mount to a path that it creates and owns? Similar comment for the 
hierarchy-prefix.
 - CgroupsLCEResourcesHandler is swallowing exceptions and errors in multiple 
places - updateCgroup() and createCgroup(). In the latter, if cgroups are 
enabled and we can't create the file, isn't that a critical error?

One overarching improvement worth pursuing immediately, either now or in 
follow-up tickets:
 - Make ResourcesHandler top-level. I'd like to merge the ContainersMonitor 
functionality with this so as to monitor/enforce memory limits as well. 
ContainersMonitor is top-level; we should make ResourcesHandler top-level too, 
so that other platforms don't need to recreate this type hierarchy all over again 
when they wish to implement some or all of this functionality.
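On the swallowed-exception point, here is a hedged sketch of the behaviour being asked for; class, method, and field names are assumptions, not the real CgroupsLCEResourcesHandler. The idea: when cgroup enforcement is enabled and the cgroup cannot be created, surface the failure instead of logging and continuing.

{code}
import java.io.File;
import java.io.IOException;

// Illustrative sketch only; not the actual CgroupsLCEResourcesHandler code.
class CgroupCreator {
  private final boolean cgroupsEnabled;
  private final String mountPath;      // e.g. "/sys/fs/cgroup" (illustrative)

  CgroupCreator(boolean cgroupsEnabled, String mountPath) {
    this.cgroupsEnabled = cgroupsEnabled;
    this.mountPath = mountPath;
  }

  void createCgroup(String controller, String group) throws IOException {
    File dir = new File(mountPath + "/" + controller + "/" + group);
    if (dir.isDirectory()) {
      return; // already present (covers the remount/restart case at this level)
    }
    if (!dir.mkdirs()) {
      if (cgroupsEnabled) {
        // Critical: the container cannot be isolated, so surface the error.
        throw new IOException("Failed to create cgroup " + dir.getAbsolutePath());
      }
      // Enforcement off: degrading to a warning would be acceptable here.
    }
  }
}
{code}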

 Add support for CPU isolation/monitoring of containers
 --

 Key: YARN-147
 URL: https://issues.apache.org/jira/browse/YARN-147
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.0.3-alpha
Reporter: Alejandro Abdelnur
Assignee: Andrew Ferguson
 Fix For: 2.0.3-alpha

 Attachments: YARN-147-v1.patch, YARN-147-v2.patch, YARN-147-v3.patch, 
 YARN-147-v4.patch, YARN-147-v5.patch, YARN-3.patch


 This is a clone for YARN-3 to be able to submit the patch as YARN-3 does not 
 show the SUBMIT PATCH button.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-181) capacity-scheduler.xml move breaks Eclipse import

2012-10-23 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-181:
-

Summary: capacity-scheduler.xml move breaks Eclipse import  (was: 
capacity-scheduler.cfg move breaks Eclipse import)

I am sure you meant capacity-scheduler.xml

 capacity-scheduler.xml move breaks Eclipse import
 -

 Key: YARN-181
 URL: https://issues.apache.org/jira/browse/YARN-181
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.2-alpha
Reporter: Siddharth Seth
Assignee: Siddharth Seth
Priority: Critical
 Attachments: YARN181_jenkins.txt, YARN181_postSvnMv.txt, 
 YARN181_svn_mv.sh


 Eclipse doesn't seem to handle testResources which resolve to an absolute 
 path. YARN-140 moved capacity-scheduler.cfg a couple of levels up to the 
 hadoop-yarn project.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-184) Remove unnecessary locking in fair scheduler, and address findbugs.

2012-10-23 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-184:
---

 Summary: Remove unnecessary locking in fair scheduler, and address 
findbugs.
 Key: YARN-184
 URL: https://issues.apache.org/jira/browse/YARN-184
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Sandy Ryza
Assignee: Sandy Ryza


In YARN-12, locks were added to all fields of QueueManager to address findbugs. 
 In addition, findbugs exclusions were added in response to MAPREDUCE-4439, 
without a deep look at the code.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-181) capacity-scheduler.xml move breaks Eclipse import

2012-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482775#comment-13482775
 ] 

Hudson commented on YARN-181:
-

Integrated in Hadoop-trunk-Commit #2919 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/2919/])
YARN-181. Fixed eclipse settings broken by capacity-scheduler.xml move via 
YARN-140. Contributed by Siddharth Seth. (Revision 1401504)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1401504
Files : 
* 
/hadoop/common/trunk/hadoop-assemblies/src/main/resources/assemblies/hadoop-yarn-dist.xml
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/conf/capacity-scheduler.xml
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/conf
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/conf/capacity-scheduler.xml
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/pom.xml


 capacity-scheduler.xml move breaks Eclipse import
 -

 Key: YARN-181
 URL: https://issues.apache.org/jira/browse/YARN-181
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.2-alpha
Reporter: Siddharth Seth
Assignee: Siddharth Seth
Priority: Critical
 Fix For: 2.0.3-alpha

 Attachments: YARN181_jenkins.txt, YARN181_postSvnMv.txt, 
 YARN181_svn_mv.sh


 Eclipse doesn't seem to handle testResources which resolve to an absolute 
 path. YARN-140 moved capacity-scheduler.cfg a couple of levels up to the 
 hadoop-yarn project.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-167) AM stuck in KILL_WAIT for days

2012-10-23 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482778#comment-13482778
 ] 

Vinod Kumar Vavilapalli commented on YARN-167:
--

bq. Afterwards, the Task Attempt transitions from SUCCESS_CONTAINER_CLEANUP to 
SUCCEEDED. In either of these states TA_KILL is ignored. So the Task stays in 
KILL_WAIT and consequently the Job too.
This is fine. Job waits for all tasks and taskAttempts to 'finish', not just 
killed. In this case, TA will succeed and inform the job about the same, so 
that the job doesn't wait for this task anymore.

bq. I am rather nervous about back porting MAPREDUCE-3353. It is a major 
feature that has a significant footprint and was not all that stable when it 
first went in. I know that it has since stabilized but I am still nervous about 
such a large change.
Understand that it is a big change, but if we want to address this issue, we 
need that patch. Given that MAPREDUCE-3353 has hardened on trunk, we should 
consider pulling it into 0.23. 

bq. It seems like it would be simpler to handle the KILL events in the states 
that missed it.
There isn't anything like a missed state that is causing this issue if I 
understand Ravi's issue description correctly.

 AM stuck in KILL_WAIT for days
 --

 Key: YARN-167
 URL: https://issues.apache.org/jira/browse/YARN-167
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 0.23.3
Reporter: Ravi Prakash
Assignee: Vinod Kumar Vavilapalli
 Attachments: TaskAttemptStateGraph.jpg


 We found some jobs were stuck in KILL_WAIT for days on end. The RM shows them 
 as RUNNING. When you go to the AM, it shows it in the KILL_WAIT state, and a 
 few maps running. All these maps were scheduled on nodes which are now in the 
 RM's Lost nodes list. The running maps are in the FAIL_CONTAINER_CLEANUP state.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-182) Container killed by the ApplicationMaster

2012-10-23 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-182:
-

Attachment: Log.txt

Attaching log.

 Container killed by the ApplicationMaster
 -

 Key: YARN-182
 URL: https://issues.apache.org/jira/browse/YARN-182
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.0.1-alpha
Reporter: zhengqiu cai
  Labels: hadoop
 Attachments: Log.txt


 I was running wordcount and the resourcemanager web UI showed the status as 
 FINISHED SUCCEEDED, but the log showed "Container killed by the 
 ApplicationMaster".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-182) Unnecessary "Container killed by the ApplicationMaster" message for successful containers

2012-10-23 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-182:
-

Summary: Unnecessary "Container killed by the ApplicationMaster" message 
for successful containers  (was: Container killed by the ApplicationMaster)

 Unnecessary "Container killed by the ApplicationMaster" message for 
 successful containers
 -

 Key: YARN-182
 URL: https://issues.apache.org/jira/browse/YARN-182
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.0.1-alpha
Reporter: zhengqiu cai
  Labels: hadoop
 Attachments: Log.txt


 I was running wordcount and the resourcemanager web UI showed the status as 
 FINISHED SUCCEEDED, but the log showed "Container killed by the 
 ApplicationMaster".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-181) capacity-scheduler.xml move breaks Eclipse import

2012-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482811#comment-13482811
 ] 

Hudson commented on YARN-181:
-

Integrated in Hadoop-Yarn-trunk #13 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/13/])
YARN-181. Fixed eclipse settings broken by capacity-scheduler.xml move via 
YARN-140. Contributed by Siddharth Seth. (Revision 1401504)

 Result = FAILURE
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1401504
Files : 
* 
/hadoop/common/trunk/hadoop-assemblies/src/main/resources/assemblies/hadoop-yarn-dist.xml
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/conf/capacity-scheduler.xml
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/conf
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/conf/capacity-scheduler.xml
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/pom.xml


 capacity-scheduler.xml move breaks Eclipse import
 -

 Key: YARN-181
 URL: https://issues.apache.org/jira/browse/YARN-181
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.2-alpha
Reporter: Siddharth Seth
Assignee: Siddharth Seth
Priority: Critical
 Fix For: 2.0.3-alpha

 Attachments: YARN181_jenkins.txt, YARN181_postSvnMv.txt, 
 YARN181_svn_mv.sh


 Eclipse doesn't seem to handle testResources which resolve to an absolute 
 path. YARN-140 moved capacity-scheduler.cfg a couple of levels up to the 
 hadoop-yarn project.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-179) Bunch of test failures on trunk

2012-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482814#comment-13482814
 ] 

Hudson commented on YARN-179:
-

Integrated in Hadoop-Yarn-trunk #13 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/13/])
YARN-179. Fix some unit test failures. (Contributed by Vinod Kumar 
Vavilapalli) (Revision 1401481)

 Result = FAILURE
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1401481
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher/pom.xml
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher/src/main/java/org/apache/hadoop/yarn/applications/unmanagedamlauncher/UnmanagedAMLauncher.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher/src/test/java/org/apache/hadoop/yarn/applications/unmanagedamlauncher/TestUnmanagedAMLauncher.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/pom.xml
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java


 Bunch of test failures on trunk
 ---

 Key: YARN-179
 URL: https://issues.apache.org/jira/browse/YARN-179
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.0.2-alpha
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Blocker
 Fix For: 2.0.3-alpha

 Attachments: YARN-179-20121022.3.txt, YARN-179-20121022.4.txt


 {{CapacityScheduler.setConf()}} mandates a YarnConfiguration. It doesn't need 
 to: throughout all of YARN, components depend only on Configuration and 
 rely on the callers to provide the correct configuration.
 This is causing multiple tests to fail.
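 A minimal sketch of the relaxation the description argues for, purely illustrative rather than the committed YARN-179 change: accept any Configuration in setConf and wrap it only if a YarnConfiguration view is needed internally, instead of rejecting plain Configuration objects.

{code}
import org.apache.hadoop.conf.Configurable;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Illustrative sketch only; not the actual CapacityScheduler change.
class LenientlyConfiguredScheduler implements Configurable {
  private Configuration conf;

  @Override
  public void setConf(Configuration conf) {
    // Accept whatever the caller provides; wrap only if YARN defaults are needed.
    this.conf = (conf instanceof YarnConfiguration) ? conf : new YarnConfiguration(conf);
  }

  @Override
  public Configuration getConf() {
    return conf;
  }
}
{code}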

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-184) Remove unnecessary locking in fair scheduler, and address findbugs excludes.

2012-10-23 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-184:


Summary: Remove unnecessary locking in fair scheduler, and address findbugs 
excludes.  (was: Remove unnecessary locking in fair scheduler, and address 
findbugs.)

 Remove unnecessary locking in fair scheduler, and address findbugs excludes.
 

 Key: YARN-184
 URL: https://issues.apache.org/jira/browse/YARN-184
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Sandy Ryza
Assignee: Sandy Ryza

 In YARN-12, locks were added to all fields of QueueManager to address 
 findbugs.  In addition, findbugs exclusions were added in response to 
 MAPREDUCE-4439, without a deep look at the code.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-167) AM stuck in KILL_WAIT for days

2012-10-23 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482853#comment-13482853
 ] 

Vinod Kumar Vavilapalli commented on YARN-167:
--

bq. There isn't anything like a missed state that is causing this issue if I 
understand Ravi's issue description correctly.
Obviously, this could be wrong.

Ravi, if you have one of these stuck AMs lying around, can you take a thread 
dump please?

 AM stuck in KILL_WAIT for days
 --

 Key: YARN-167
 URL: https://issues.apache.org/jira/browse/YARN-167
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 0.23.3
Reporter: Ravi Prakash
Assignee: Vinod Kumar Vavilapalli
 Attachments: TaskAttemptStateGraph.jpg


 We found some jobs were stuck in KILL_WAIT for days on end. The RM shows them 
 as RUNNING. When you go to the AM, it shows it in the KILL_WAIT state, and a 
 few maps running. All these maps were scheduled on nodes which are now in the 
 RM's Lost nodes list. The running maps are in the FAIL_CONTAINER_CLEANUP state.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-139) Interrupted Exception within AsyncDispatcher leads to user confusion

2012-10-23 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-139:
-

Attachment: YARN-139-20121023.txt

Thanks Jason, was thinking of fixing that separately but here it goes.

 Interrupted Exception within AsyncDispatcher leads to user confusion
 

 Key: YARN-139
 URL: https://issues.apache.org/jira/browse/YARN-139
 Project: Hadoop YARN
  Issue Type: Bug
  Components: api
Affects Versions: 2.0.2-alpha, 0.23.4
Reporter: Nathan Roberts
Assignee: Vinod Kumar Vavilapalli
 Attachments: YARN-139-20121019.1.txt, YARN-139-20121019.txt, 
 YARN-139-20121023.txt, YARN-139.txt


 Successful applications tend to get InterruptedExceptions during shutdown. 
 The exception is harmless but it leads to lots of user confusion and 
 therefore could be cleaned up.
 2012-09-28 14:50:12,477 WARN [AsyncDispatcher event handler] 
 org.apache.hadoop.yarn.event.AsyncDispatcher: Interrupted Exception while 
 stopping
 java.lang.InterruptedException
   at java.lang.Object.wait(Native Method)
   at java.lang.Thread.join(Thread.java:1143)
   at java.lang.Thread.join(Thread.java:1196)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher.stop(AsyncDispatcher.java:105)
   at 
 org.apache.hadoop.yarn.service.CompositeService.stop(CompositeService.java:99)
   at 
 org.apache.hadoop.yarn.service.CompositeService.stop(CompositeService.java:89)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler.handle(MRAppMaster.java:437)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler.handle(MRAppMaster.java:402)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
   at java.lang.Thread.run(Thread.java:619)
 2012-09-28 14:50:12,477 INFO [AsyncDispatcher event handler] 
 org.apache.hadoop.yarn.service.AbstractService: Service:Dispatcher is stopped.
 2012-09-28 14:50:12,477 INFO [AsyncDispatcher event handler] 
 org.apache.hadoop.yarn.service.AbstractService: 
 Service:org.apache.hadoop.mapreduce.v2.app.MRAppMaster is stopped.
 2012-09-28 14:50:12,477 INFO [AsyncDispatcher event handler] 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Exiting MR AppMaster..GoodBye
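 A minimal sketch of the clean-up the description suggests (not the attached patch; the field names are assumptions): treat the InterruptedException raised while joining the dispatcher thread as an expected part of shutdown, preserving the interrupt status and logging at debug instead of emitting the WARN and stack trace above.

{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

// Hedged sketch of the idea; not the actual AsyncDispatcher change.
class QuietStopExample {
  private static final Log LOG = LogFactory.getLog(QuietStopExample.class);
  private final Thread eventHandlingThread;   // illustrative field name
  private volatile boolean stopped;           // flag the event loop would check (not shown)

  QuietStopExample(Thread eventHandlingThread) {
    this.eventHandlingThread = eventHandlingThread;
  }

  void stop() {
    stopped = true;
    eventHandlingThread.interrupt();
    try {
      eventHandlingThread.join();
    } catch (InterruptedException e) {
      // Expected during shutdown: keep the interrupt flag, skip the scary WARN.
      Thread.currentThread().interrupt();
      LOG.debug("Interrupted while stopping the dispatcher", e);
    }
  }
}
{code}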

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-139) Interrupted Exception within AsyncDispatcher leads to user confusion

2012-10-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482876#comment-13482876
 ] 

Hadoop QA commented on YARN-139:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12550561/YARN-139-20121023.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/123//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/123//console

This message is automatically generated.

 Interrupted Exception within AsyncDispatcher leads to user confusion
 

 Key: YARN-139
 URL: https://issues.apache.org/jira/browse/YARN-139
 Project: Hadoop YARN
  Issue Type: Bug
  Components: api
Affects Versions: 2.0.2-alpha, 0.23.4
Reporter: Nathan Roberts
Assignee: Vinod Kumar Vavilapalli
 Attachments: YARN-139-20121019.1.txt, YARN-139-20121019.txt, 
 YARN-139-20121023.txt, YARN-139.txt


 Successful applications tend to get InterruptedExceptions during shutdown. 
 The exception is harmless but it leads to lots of user confusion and 
 therefore could be cleaned up.
 2012-09-28 14:50:12,477 WARN [AsyncDispatcher event handler] 
 org.apache.hadoop.yarn.event.AsyncDispatcher: Interrupted Exception while 
 stopping
 java.lang.InterruptedException
   at java.lang.Object.wait(Native Method)
   at java.lang.Thread.join(Thread.java:1143)
   at java.lang.Thread.join(Thread.java:1196)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher.stop(AsyncDispatcher.java:105)
   at 
 org.apache.hadoop.yarn.service.CompositeService.stop(CompositeService.java:99)
   at 
 org.apache.hadoop.yarn.service.CompositeService.stop(CompositeService.java:89)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler.handle(MRAppMaster.java:437)
   at 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler.handle(MRAppMaster.java:402)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
   at java.lang.Thread.run(Thread.java:619)
 2012-09-28 14:50:12,477 INFO [AsyncDispatcher event handler] 
 org.apache.hadoop.yarn.service.AbstractService: Service:Dispatcher is stopped.
 2012-09-28 14:50:12,477 INFO [AsyncDispatcher event handler] 
 org.apache.hadoop.yarn.service.AbstractService: 
 Service:org.apache.hadoop.mapreduce.v2.app.MRAppMaster is stopped.
 2012-09-28 14:50:12,477 INFO [AsyncDispatcher event handler] 
 org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Exiting MR AppMaster..GoodBye

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-180) Capacity scheduler - containers that get reserved create container token too early

2012-10-23 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482887#comment-13482887
 ] 

Thomas Graves commented on YARN-180:


+1 for the latest patch. I manually tested this on a small cluster and verified 
that a container can be reserved for more than 10 minutes and the AM can still start 
the container after finally being allocated it.

 Capacity scheduler - containers that get reserved create container token too 
 early
 -

 Key: YARN-180
 URL: https://issues.apache.org/jira/browse/YARN-180
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 0.23.3
Reporter: Thomas Graves
Assignee: Arun C Murthy
Priority: Critical
 Fix For: 2.0.3-alpha, 0.23.5

 Attachments: YARN-180-branch_0.23.patch, YARN-180.patch, 
 YARN-180.patch, YARN-180.patch


 The capacity scheduler has the ability to 'reserve' containers.  
 Unfortunately before it decides that it goes to reserved rather then 
 assigned, the Container object is created which creates a container token 
 that expires in roughly 10 minutes by default.  
 This means that by the time the NM frees up enough space on that node for the 
 container to move to assigned the container token may have expired.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-178) Fix custom ProcessTree instance creation

2012-10-23 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482917#comment-13482917
 ] 

Bikas Saha commented on YARN-178:
-

Sure. Please go ahead.

 Fix custom ProcessTree instance creation
 

 Key: YARN-178
 URL: https://issues.apache.org/jira/browse/YARN-178
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.5
Reporter: Radim Kolar
Assignee: Radim Kolar
Priority: Critical
 Attachments: pstree-instance2.txt, pstree-instance.txt


 1. The current pluggable resourcecalculatorprocesstree is not passed the root 
 process id, making custom implementations unusable.
 2. pstree does not extend Configured as it should.
 Added a constructor with a pid argument, plus a test suite. Also added a test that 
 pstree is correctly configured.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira