date:20120627

[jira] [Commented] (MAPREDUCE-4338) NodeManager daemon is failing to start.

2012-06-27 Thread srikanth ayalasomayajulu (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402885#comment-13402885
 ] 

srikanth ayalasomayajulu commented on MAPREDUCE-4338:
-

I disabled the firewall and opened the port, but the NodeManager still does 
not start on the slave machines. Please help; this is severely blocking my 
work.
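
One way to narrow this down: "Connection refused" typically means the host was reached but nothing accepted the connection on that port, so it is worth checking whether anything is listening on mvm4:8025 (the address the NodeManager tries to register with in the trace). Below is a minimal sketch of such a probe. PortProbe and canConnect are hypothetical names, not part of Hadoop, and the 3-second timeout is an arbitrary assumption.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper, not part of Hadoop: checks whether a TCP port accepts connections. */
final class PortProbe {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // "Connection refused", timeouts and unknown hosts all end up here.
            return false;
        }
    }

    public static void main(String[] args) {
        // mvm4:8025 is the address taken from the stack trace in this issue.
        String host = args.length > 0 ? args[0] : "mvm4";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8025;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 3000));
    }
}
```

Running it from the slave (mvm5) and again on mvm4 itself should show whether the problem is the network path or a missing listener on the master.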

> NodeManager daemon is failing to start.
> ---
>
> Key: MAPREDUCE-4338
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4338
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 0.23.0
> Environment: Ubuntu Server 11.04, 
>Reporter: srikanth ayalasomayajulu
>  Labels: features, hadoop
> Fix For: 0.23.0
>
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>
> The NodeManager daemon does not start on the slave machines and fails with 
> the error below.
> 2012-06-12 19:05:56,172 FATAL nodemanager.NodeManager (NodeManager.java:main(233)) - Error starting NodeManager
> org.apache.hadoop.yarn.YarnException: Failed to Start org.apache.hadoop.yarn.server.nodemanager.NodeManager
>     at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:78)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.start(NodeManager.java:163)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:231)
> Caused by: org.apache.avro.AvroRuntimeException: java.lang.reflect.UndeclaredThrowableException
>     at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:132)
>     at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
>     ... 2 more
> Caused by: java.lang.reflect.UndeclaredThrowableException
>     at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:66)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:161)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:128)
>     ... 3 more
> Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: Call From mvm5/192.168.100.177 to mvm4:8025 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
>     at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:139)
>     at $Proxy14.registerNodeManager(Unknown Source)
>     at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:59)
>     ... 5 more
> Caused by: java.net.ConnectException: Call From mvm5/192.168.100.177 to mvm4:8025 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
>     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:617)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1089)
>     at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:136)
>     ... 7 more
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:419)
>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:460)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:557)
>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:205)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1195)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1065)
>     ... 8 more
> 2012-06-12 19:05:56,184 INFO  ipc.Server (Server.java:stop(1709)) - Stopping server on 47645
> 2012-06-12 19:05:56,184 INFO  ipc.Server (Server.java:stop(1709)) - Stopping server on 4344
> 2012-06-12 19:05:56,190 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(199)) - Stopping NodeManager metrics system...
> 2012-06-12 19:05:56,190 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:stopSources(408)) - Stopping metrics source JvmMetrics
> 2012-06-12 19:05:56,191 INFO  nodemanager.NodeManager (StringUtils.java:run(605)) - SHUTDOWN_MSG:

-

[jira] [Updated] (MAPREDUCE-4380) Empty Userlogs directory is getting created under logs directory

2012-06-27 Thread Devaraj K (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-4380:
-

      Component/s: nodemanager, mrv2
         Priority: Minor  (was: Major)
Affects Version/s: 3.0.0, 2.0.0-alpha

> Empty Userlogs directory is getting created under logs directory
> 
>
> Key: MAPREDUCE-4380
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4380
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, nodemanager
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Devaraj K
>Priority: Minor
>
> Empty Userlogs directory is getting created under logs directory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Created] (MAPREDUCE-4380) Empty Userlogs directory is getting created under logs directory

2012-06-27 Thread Devaraj K (JIRA)

Devaraj K created MAPREDUCE-4380:


 Summary: Empty Userlogs directory is getting created under logs directory
 Key: MAPREDUCE-4380
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4380
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Devaraj K


Empty Userlogs directory is getting created under logs directory.


[jira] [Commented] (MAPREDUCE-4369) Fix streaming job failures with WindowsResourceCalculatorPlugin

2012-06-27 Thread Ivan Mitic (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402792#comment-13402792
 ] 

Ivan Mitic commented on MAPREDUCE-4369:
---

Thanks for the change Bikas!

A few questions/suggestions:
1. In {{WindowsResourceCalculatorPlugin#getProcResourceValues()}} you mention 
that some tests use JVM_PID. Do you happen to have a list of these tests?
2. Can you please refactor 
{{ResourceCalculatorPlugin#getResourceCalculatorPlugin()}} to accept 
processPid, and update call sites to pass the appropriate value (I see only 3 
call sites). The cause of this bug in the first place is not having all call 
sites set the processPid accordingly. And then, if the passed-in processPid is 
null, you can fall back to {{System.getenv().get("JVM_PID")}}. Make sense? If 
I'm seeing things correctly, this way you might be able to clean up some of the 
newly introduced code.
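
To make the suggested fallback concrete, here is a minimal sketch. It is not the ResourceCalculatorPlugin code: PidResolver and resolvePid are hypothetical names, and the environment is passed in as a map purely so the logic can be exercised without touching the real process environment.

```java
import java.util.Map;

/** Hypothetical sketch of the suggested fallback; not the Hadoop API. */
final class PidResolver {
    /**
     * Returns the pid passed by the call site if there is one, otherwise the
     * JVM_PID entry of the given environment (null if that is absent too).
     */
    static String resolvePid(String processPid, Map<String, String> env) {
        if (processPid != null) {
            return processPid;
        }
        return env.get("JVM_PID");
    }

    public static void main(String[] args) {
        // With no explicit pid, this prints JVM_PID from the real environment, or null.
        System.out.println(resolvePid(null, System.getenv()));
    }
}
```

With something like this in one place, each call site passes the pid it knows about and only callers without one rely on the environment variable.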


> Fix streaming job failures with WindowsResourceCalculatorPlugin
> ---
>
> Key: MAPREDUCE-4369
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4369
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: MAPREDUCE-4369.branch-1-win.1.patch
>
>
> Some streaming jobs use local mode job runs that do not start task trackers. 
> In these cases, the JVM context is not set up and hence local mode execution 
> causes the code to crash.
> The fix is either to not use ResourceCalculatorPlugin in such cases or to make 
> the local job run create dummy JVM contexts. Choosing the first option because 
> that's the current implicit behavior in Linux. The ProcfsBasedProcessTree 
> (used inside the LinuxResourceCalculatorPlugin) does no real work when the 
> process pid is not set up correctly. This is what happens when local job mode 
> runs.


[jira] [Commented] (MAPREDUCE-4374) Fix child task environment variable config and add support for Windows

2012-06-27 Thread Ivan Mitic (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402782#comment-13402782
 ] 

Ivan Mitic commented on MAPREDUCE-4374:
---

+1, change looks good to me. Agree on your points for using '%' and ';' on 
Windows.


> Fix child task environment variable config and add support for Windows
> --
>
> Key: MAPREDUCE-4374
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4374
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 1-win
>Reporter: Chuan Liu
>Assignee: Chuan Liu
>Priority: Minor
> Attachments: MAPREDUCE-4374-branch-1-win.patch
>
>
> In HADOOP-2838, a new feature was introduced to set environment variables via 
> the Hadoop config 'mapred.child.env' for child tasks. There have been some 
> further fixes and improvements around this feature, e.g. HADOOP-5981 was a bug 
> fix; MAPREDUCE-478 broke the config into 'mapred.map.child.env' and 
> 'mapred.reduce.child.env'. However, the current implementation is still not 
> complete. I believe it does not match its documentation or original intent. 
> Also, by using ‘:’ (colon) and ‘;’ (semicolon) in the configuration 
> syntax, we will have problems using them on Windows because ‘:’ appears very 
> often in Windows paths, as in “C:\”, and environment variables are used very 
> often to hold path names. This Jira is created to fix the problem and provide 
> support on Windows.
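
A small illustration of the separator problem described above. This is a hypothetical sketch, not code from the attached patch: EnvSyntax and splitPathList are made-up names, and using ';' on Windows and ':' elsewhere simply follows the java.io.File.pathSeparator convention.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical illustration; not part of Hadoop or the attached patch. */
final class EnvSyntax {
    /** Splits a path-list value on ';' for Windows and on ':' otherwise. */
    static List<String> splitPathList(String value, boolean windows) {
        String separator = windows ? ";" : ":";
        return Arrays.asList(value.split(Pattern.quote(separator)));
    }

    public static void main(String[] args) {
        String windowsValue = "C:\\hadoop\\bin;D:\\tools";
        // Splitting on ':' cuts through the drive letters.
        System.out.println(splitPathList(windowsValue, false));
        // Splitting on ';' keeps each Windows path intact.
        System.out.println(splitPathList(windowsValue, true));
    }
}
```

The first call breaks the value into three meaningless pieces, which is the kind of breakage a ':'-based syntax invites once values hold Windows paths.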


[jira] [Resolved] (MAPREDUCE-4365) Shipping Profiler Libraries by DistributedCache

2012-06-27 Thread Jie Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Li resolved MAPREDUCE-4365.
---

  Resolution: Fixed
Target Version/s:   (was: 1.1.0)

One way is to include the profiler library in the job jar and use a relative 
path such as "../../foo.library" to locate it.

Thanks Deveraj, Sid, Vinod and everyone!

> Shipping Profiler Libraries by DistributedCache
> ---
>
> Key: MAPREDUCE-4365
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4365
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 1.0.3
>Reporter: Jie Li
>
> Hadoop profiling is great for performance tuning and debugging, but currently 
> we can only use Java built-in profilers such as HProf, and for other 
> profilers we need to install them on all slave nodes first, which is 
> inconvenient for large clusters and sometimes impossible for production 
> clusters. 
> Supporting shipping profiler libraries using DistributedCache will solve this 
> problem. For example, in mapred.task.profile.params, we specify a profiler 
> library from the DistributedCache using special place holders such as 
> , and Hadoop can look at the DistributedCache to replace  
> with the localized path before launching the child jvm.


[jira] [Commented] (MAPREDUCE-4355) Add RunningJob.getJobStatus()

2012-06-27 Thread Alejandro Abdelnur (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402691#comment-13402691
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4355:
---

Regarding changing updateStatus() to ensureFreshStatus(): no, I think 
updateStatus() is more appropriate.


> Add RunningJob.getJobStatus()
> -
>
> Key: MAPREDUCE-4355
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv1, mrv2
>Affects Versions: 1.0.3, 2.0.0-alpha
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Fix For: 1.1.0, 2.0.1-alpha
>
> Attachments: MR-4355_mr1.patch, MR-4355_mr1.patch, MR-4355_mr2.patch
>
>
> Usecase: Read the start/end-time of a particular job.
> Currently, one has to call JobClient.getAllJobStatuses() and iterate through 
> the results. JobClient.getJob(JobID) returns RunningJob, which 
> doesn't hold the job's start time.
> Adding RunningJob.getJobStatus() solves the issue.


[jira] [Commented] (MAPREDUCE-4317) Job view ACL checks are too permissive

2012-06-27 Thread Karthik Kambatla (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402641#comment-13402641
 ] 

Karthik Kambatla commented on MAPREDUCE-4317:
-

Alejandro, 

The API's Javadoc (below) states that the job will be null if no job with 
that JobID exists. The old API behaves the same way.

{code}
  /**
   * Validates if current user can view the job.
   * If user is not authorized to view the job, this method will modify the
   * response and forwards to an error page and returns Job with
   * viewJobAccess flag set to false.
   * @return JobWithViewAccessCheck object(contains JobInProgress object and
   * viewJobAccess flag). Callers of this method will check the flag
   * and decide if view should be allowed or not. Job will be null if
   * the job with given jobid doesnot exist at the JobTracker.
   */
{code}

> Job view ACL checks are too permissive
> --
>
> Key: MAPREDUCE-4317
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4317
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1
>Affects Versions: 1.0.3
>Reporter: Harsh J
>Assignee: Karthik Kambatla
> Attachments: MR-4317.patch, MR-4317.patch
>
>
> The class that does view-based checks, JSPUtil.JobWithViewAccessCheck, has 
> the following internal member:
> {code}private boolean isViewAllowed = true;{code}
> Note that it's true.
> Now, the method that sets proper view-allowed rights has:
> {code}
> if (user != null && job != null && jt.areACLsEnabled()) {
>   final UserGroupInformation ugi =
>       UserGroupInformation.createRemoteUser(user);
>   try {
>     ugi.doAs(new PrivilegedExceptionAction() {
>       public Void run() throws IOException, ServletException {
>         // checks job view permission
>         jt.getACLsManager().checkAccess(job, ugi,
>             Operation.VIEW_JOB_DETAILS);
>         return null;
>       }
>     });
>   } catch (AccessControlException e) {
>     String errMsg = "User " + ugi.getShortUserName() +
>         " failed to view " + jobid + "!" + e.getMessage() +
>         "Go back to JobTracker";
>     JSPUtil.setErrorAndForward(errMsg, request, response);
>     myJob.setViewAccess(false);
>   } catch (InterruptedException e) {
>     String errMsg = " Interrupted while trying to access " + jobid +
>         "Go back to JobTracker";
>     JSPUtil.setErrorAndForward(errMsg, request, response);
>     myJob.setViewAccess(false);
>   }
> }
> return myJob;
> {code}
> In the above snippet, if user == null, which can happen when the user is 
> not HTTP-authenticated (the value comes from request.getRemoteUser()), the 
> view stays visible, because the default is true and we never toggle it 
> to false for the user == null case.
> Ideally the default of the view job ACL must be false, or we need an else 
> clause that sets the view rights to false in case of a failure to find the 
> user ID.


[jira] [Commented] (MAPREDUCE-4355) Add RunningJob.getJobStatus()

2012-06-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402640#comment-13402640
 ] 

Hadoop QA commented on MAPREDUCE-4355:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12533723/MR-4355_mr1.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2524//console

This message is automatically generated.



[jira] [Updated] (MAPREDUCE-4355) Add RunningJob.getJobStatus()

2012-06-27 Thread Karthik Kambatla (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4355:


Attachment: MR-4355_mr1.patch

Updated patch for MR1.

ensureFreshStatus() calls updateStatus() only after a particular amount of time 
has passed since the previous updateStatus() call.

For getJobStatus(), to get the latest status, we need to call updateStatus(). 
Do you suggest calling ensureFreshStatus() instead for consistency?
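
To spell out the difference between the two calls, here is a minimal sketch of the throttling pattern described above. It is not the JobClient implementation: CachedStatus, maxAgeMillis and the injected clock are hypothetical names introduced for illustration, and the update counter stands in for the remote status fetch.

```java
import java.util.function.LongSupplier;

/** Hypothetical stand-in for the status-caching logic; not the JobClient code. */
final class CachedStatus {
    private final long maxAgeMillis;
    private final LongSupplier clock;
    private long lastUpdateMillis;
    private int updateCount;

    CachedStatus(long maxAgeMillis, LongSupplier clock) {
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
        updateStatus(); // fetch once so there is always a status to return
    }

    /** Always refreshes; the counter stands in for the remote fetch. */
    void updateStatus() {
        updateCount++;
        lastUpdateMillis = clock.getAsLong();
    }

    /** Refreshes only if the cached status is older than maxAgeMillis. */
    void ensureFreshStatus() {
        if (clock.getAsLong() - lastUpdateMillis > maxAgeMillis) {
            updateStatus();
        }
    }

    int getUpdateCount() {
        return updateCount;
    }
}
```

The trade-off is visible here: a getter built on ensureFreshStatus() may return a status up to maxAgeMillis old, while one built on updateStatus() pays for a refresh on every call.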



[jira] [Resolved] (MAPREDUCE-4373) Fix Javadoc warnings in JobClient.

2012-06-27 Thread Karthik Kambatla (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla resolved MAPREDUCE-4373.
-

  Resolution: Won't Fix
Release Note: The changes from MAPREDUCE-4355 have been reverted, so the code 
no longer produces these warnings.

> Fix Javadoc warnings in JobClient.
> --
>
> Key: MAPREDUCE-4373
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4373
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.0.1-alpha, 3.0.0
>Reporter: Robert Joseph Evans
>Assignee: Karthik Kambatla
>
> It looks like MAPREDUCE-4355 added two new Javadoc warnings.
> {code}
> [WARNING] /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobClient.java:651: warning - @param argument "jobid" is not a parameter name.
> [WARNING] /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobClient.java:669: warning - @param argument "jobid" is not a parameter name.
> {code}
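
For context, Javadoc reports this warning when a @param tag names something that is not a parameter of the documented method. The snippet below is a hypothetical illustration only: the class and method are made up and this is not the JobClient code. It assumes a parameter named jobId, in which case a lowercase "@param jobid" tag would not match, since the names are compared case-sensitively.

```java
/** Hypothetical example of a @param tag that matches its parameter. */
final class JavadocParamExample {
    /**
     * Describes a job by its id.
     *
     * @param jobId the id to describe; the tag name must match the parameter
     *              name exactly, so a tag spelled "jobid" would be reported
     *              as not being a parameter name
     * @return a short description of the job
     */
    static String describe(String jobId) {
        return "job " + jobId;
    }
}
```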


[jira] [Commented] (MAPREDUCE-4355) Add RunningJob.getJobStatus()

2012-06-27 Thread Alejandro Abdelnur (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402622#comment-13402622
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4355:
---

The mr1 patch has a few spurious changes in the test class; please revert those.

Please add a simple testcase for the mr2 case.

Also, in the mr1 patch you are using 'updateStatus()' to update the job status 
before returning the object. The method above uses 'ensureFreshStatus()'; why 
the difference?



[jira] [Commented] (MAPREDUCE-4355) Add RunningJob.getJobStatus()

2012-06-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402623#comment-13402623
 ] 

Hadoop QA commented on MAPREDUCE-4355:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12533712/MR-4355_mr2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2523//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2523//console

This message is automatically generated.



[jira] [Commented] (MAPREDUCE-4317) Job view ACL checks are too permissive

2012-06-27 Thread Alejandro Abdelnur (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402607#comment-13402607
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4317:
---

Karthik,

Why 'job == null'?

{code}
+if (!jt.areACLsEnabled() || job == null) {
+  return myJob;
+}
{code}

If job == null, then myJob is also null (or the call may even fail).

Shouldn't we check for job == null before trying to obtain myJob?
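
To make the ordering concrete, here is a simplified sketch of a check that handles job == null first and denies by default. It is a hypothetical model, not the JSPUtil code or the attached patch: ViewCheck, JobView and AclChecker are made-up names, jobs and users are plain strings, and allowing everyone when ACLs are disabled is an assumption.

```java
/** Hypothetical, simplified model of the view check; not the JSPUtil code. */
final class ViewCheck {
    /** Stand-in for the ACL lookup; an assumption made for this sketch. */
    interface AclChecker {
        boolean canView(String user, String job);
    }

    /** Result holder whose view flag starts out false (deny by default). */
    static final class JobView {
        private final String job;
        private boolean viewAllowed = false;

        JobView(String job) {
            this.job = job;
        }

        String getJob() {
            return job;
        }

        boolean isViewAllowed() {
            return viewAllowed;
        }
    }

    static JobView check(String user, String job, boolean aclsEnabled, AclChecker acls) {
        JobView view = new JobView(job);
        if (job == null) {
            return view; // no such job: nothing to show, flag stays false
        }
        if (!aclsEnabled) {
            view.viewAllowed = true; // assumption: with ACLs off, anyone may view
            return view;
        }
        // A null user (not HTTP-authenticated) never reaches the ACL lookup,
        // so the flag stays false instead of silently defaulting to true.
        if (user != null && acls.canView(user, job)) {
            view.viewAllowed = true;
        }
        return view;
    }
}
```

The point of the ordering is that nothing dereferences job before the null check, and that every path which skips the ACL lookup leaves the flag at its safe default.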





[jira] [Updated] (MAPREDUCE-4355) Add RunningJob.getJobStatus()

2012-06-27 Thread Karthik Kambatla (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4355:


Status: Patch Available  (was: Reopened)

Submitting the MR1 and MR2 patches.

- No tests for MR2: the patch only adds a wrapper call to Job.getStatus().



[jira] [Updated] (MAPREDUCE-4355) Add RunningJob.getJobStatus()

2012-06-27 Thread Karthik Kambatla (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4355:


Attachment: (was: MR-4355_mr2.patch)



[jira] [Updated] (MAPREDUCE-4355) Add RunningJob.getJobStatus()

2012-06-27 Thread Karthik Kambatla (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4355:


Attachment: (was: MR-4355_mr1.patch)

> Add RunningJob.getJobStatus()
> -
>
> Key: MAPREDUCE-4355
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv1, mrv2
>Affects Versions: 1.0.3, 2.0.0-alpha
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Fix For: 1.1.0, 2.0.1-alpha
>
> Attachments: MR-4355_mr1.patch, MR-4355_mr2.patch
>
>
> Use case: Read the start/end time of a particular job.
> Currently, one has to call JobClient.getAllJobStatuses() and 
> iterate through the results. JobClient.getJob(JobID) returns RunningJob, 
> which doesn't hold the job's start time.
> Adding RunningJob.getJobStatus() solves the issue.


[jira] [Updated] (MAPREDUCE-4355) Add RunningJob.getJobStatus()

2012-06-27 Thread Karthik Kambatla (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4355:


Attachment: MR-4355_mr2.patch

> Add RunningJob.getJobStatus()
> -
>
> Key: MAPREDUCE-4355
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv1, mrv2
>Affects Versions: 1.0.3, 2.0.0-alpha
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Fix For: 1.1.0, 2.0.1-alpha
>
> Attachments: MR-4355_mr1.patch, MR-4355_mr2.patch
>
>
> Use case: Read the start/end time of a particular job.
> Currently, one has to call JobClient.getAllJobStatuses() and 
> iterate through the results. JobClient.getJob(JobID) returns RunningJob, 
> which doesn't hold the job's start time.
> Adding RunningJob.getJobStatus() solves the issue.


[jira] [Commented] (MAPREDUCE-4322) Fix command-line length abort issues on Windows

2012-06-27 Thread Bikas Saha (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402576#comment-13402576
 ] 

Bikas Saha commented on MAPREDUCE-4322:
---

Thanks for including all comments! +1. lgtm.

> Fix command-line length abort issues on Windows
> ---
>
> Key: MAPREDUCE-4322
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4322
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
> Environment: Windows, downstream applications with long aggregate 
> classpaths
>Reporter: John Gordon
>Assignee: Ivan Mitic
> Attachments: MAPREDUCE-4322-branch-1-win(2).patch, 
> MAPREDUCE-4322-branch-1-win(3).patch, MAPREDUCE-4322-branch-1-win(4).patch, 
> MAPREDUCE-4322-branch-1-win(5).patch, MAPREDUCE-4322-branch-1-win.patch
>
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> When a task is started on the tasktracker, it creates a small batch file to 
> invoke java and runs that batch.  Within the batch file, the invocation of 
> Java currently has -classpath ${CLASSPATH} inline in the command.  That line 
> often exceeds 8000 characters.  This is ok for most Linux distributions 
> because the line limit env variable is often set much higher than this.  
> However, on Windows this causes cmd to abort execution.  This surfaces in 
> Hadoop as an unknown failure mode for the task.
> I think the easiest and most natural way to fix this is to push the 
> -classpath option into a config file to take the longest variable part of the 
> line and put it somewhere that scales better.
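
A rough sketch of the idea in the description above, not the attached patch. It swaps in the CLASSPATH environment variable for the suggested config file, since the java launcher falls back to that variable when no -classpath option is given. The class and method names are hypothetical; 8191 is the documented cmd.exe command-line limit.

```java
import java.util.ArrayList;
import java.util.List;

final class ClasspathCommandSketch {
  // cmd.exe rejects command lines longer than 8191 characters.
  static final int WINDOWS_CMD_LIMIT = 8191;

  // True when the space-joined command would be too long for cmd.exe.
  static boolean exceedsLimit(List<String> command) {
    return String.join(" ", command).length() > WINDOWS_CMD_LIMIT;
  }

  // Builds (but does not start) a launcher that passes the classpath through
  // the CLASSPATH environment variable instead of an inline -classpath option.
  static ProcessBuilder launcher(String classpath, String mainClass) {
    List<String> command = new ArrayList<>();
    command.add("java");
    command.add(mainClass);
    ProcessBuilder pb = new ProcessBuilder(command);
    pb.environment().put("CLASSPATH", classpath);
    return pb;
  }
}
```

The command line then stays short however many jars the aggregate classpath contains, because the long, variable part no longer travels on the command line.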


[jira] [Updated] (MAPREDUCE-4355) Add RunningJob.getJobStatus()

2012-06-27 Thread Karthik Kambatla (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4355:


Description: 
Use case: Read the start/end time of a particular job.

Currently, one has to call JobClient.getAllJobStatuses() and iterate through 
the results. JobClient.getJob(JobID) returns RunningJob, which doesn't hold 
the job's start time.

Adding RunningJob.getJobStatus() solves the issue.

  was:
To read the start-time of a particular job, one should not need to getAllJobs() 
and iterate through them.

getJob(JobID) returns RunningJob, which doesn't hold the job's start time.

Hence, we need to either add getJobStatus(JobID) to the API or add startTime to 
RunningJob. Doing the latter.


Summary: Add RunningJob.getJobStatus()  (was: Add startTime to 
RunningJob)
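
For illustration, the difference between the two lookups in the updated description can be sketched as below. This is a self-contained model, not the Hadoop API: JobStatusModel, RunningJobModel and JobClientModel are hypothetical stand-ins for JobStatus, RunningJob and JobClient, and only the method names getAllJobStatuses(), getJob() and getJobStatus() are taken from the description.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for JobStatus: only the fields the use case needs.
final class JobStatusModel {
  final String jobId;
  final long startTime;
  final long finishTime;

  JobStatusModel(String jobId, long startTime, long finishTime) {
    this.jobId = jobId;
    this.startTime = startTime;
    this.finishTime = finishTime;
  }
}

// Hypothetical stand-in for RunningJob with the proposed accessor.
final class RunningJobModel {
  private final JobStatusModel status;

  RunningJobModel(JobStatusModel status) {
    this.status = status;
  }

  // Models the proposed RunningJob.getJobStatus().
  JobStatusModel getJobStatus() {
    return status;
  }
}

// Hypothetical stand-in for JobClient.
final class JobClientModel {
  private final Map<String, JobStatusModel> jobs = new LinkedHashMap<>();

  void add(JobStatusModel status) {
    jobs.put(status.jobId, status);
  }

  List<JobStatusModel> getAllJobStatuses() {
    return new ArrayList<>(jobs.values());
  }

  // Returns null for an unknown job id.
  RunningJobModel getJob(String jobId) {
    JobStatusModel status = jobs.get(jobId);
    return status == null ? null : new RunningJobModel(status);
  }

  // Current workaround: fetch every status and scan for the one job.
  long startTimeByScan(String jobId) {
    for (JobStatusModel s : getAllJobStatuses()) {
      if (s.jobId.equals(jobId)) {
        return s.startTime;
      }
    }
    return -1L;
  }

  // With the proposed accessor: one lookup, no scan over all jobs.
  long startTimeDirect(String jobId) {
    RunningJobModel job = getJob(jobId);
    return job == null ? -1L : job.getJobStatus().startTime;
  }
}
```

Both methods return the same value; the second only touches the one job it was asked about.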

> Add RunningJob.getJobStatus()
> -
>
> Key: MAPREDUCE-4355
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv1, mrv2
>Affects Versions: 1.0.3, 2.0.0-alpha
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Fix For: 1.1.0, 2.0.1-alpha
>
> Attachments: MR-4355_mr1.patch, MR-4355_mr1.patch, MR-4355_mr2.patch
>
>
> Use case: Read the start/end time of a particular job.
> Currently, one has to call JobClient.getAllJobStatuses() and 
> iterate through the results. JobClient.getJob(JobID) returns RunningJob, 
> which doesn't hold the job's start time.
> Adding RunningJob.getJobStatus() solves the issue.


[jira] [Updated] (MAPREDUCE-4355) Add RunningJob.getJobStatus()

2012-06-27 Thread Karthik Kambatla (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4355:


Attachment: MR-4355_mr1.patch

> Add RunningJob.getJobStatus()
> -
>
> Key: MAPREDUCE-4355
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv1, mrv2
>Affects Versions: 1.0.3, 2.0.0-alpha
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Fix For: 1.1.0, 2.0.1-alpha
>
> Attachments: MR-4355_mr1.patch, MR-4355_mr1.patch, MR-4355_mr2.patch
>
>
> Use case: Read the start/end time of a particular job.
> Currently, one has to call JobClient.getAllJobStatuses() and 
> iterate through the results. JobClient.getJob(JobID) returns RunningJob, 
> which doesn't hold the job's start time.
> Adding RunningJob.getJobStatus() solves the issue.


[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

2012-06-27 Thread Alejandro Abdelnur (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402522#comment-13402522
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4346:
---

@Arun, 

I'm working with Ahmed on this one. 

The use case we have is large clusters running 1000+ concurrent jobs, where 
monitoring agents query the cluster for jobs in different statuses; most of 
the time these agents focus on running or just-finished jobs. With the current 
API we are forced to query ALL jobs, including retired jobs (which 
significantly increases the number of jobs returned), and do the filtering on 
the client side. This creates unnecessary load on the JT (serializing all jobs) 
and on the client (deserializing all jobs). Thus adding this new API, which 
does not break backwards compatibility, will definitely help reduce this load. 

Regarding the support in MRv2, we currently have the getAllJobs() method 
there as well, so we can certainly address it on the client side (the fallback 
implementation Ahmed did in the client for MRv1). We could also add a PB call 
to support the filtering on the RM side. While looking at the MRv2 code I 
noticed that we only query the RM, which means that completed jobs will never 
be returned by this call. If I'm correct here, a solution would be for the 
client to call the HS to ask for jobs younger than X; this would be the 
equivalent of 'retired' jobs, and the filtering would definitely be useful 
there as well, for the same reasons explained above.
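
The client-side filtering described above amounts to fetching every job and then discarding the ones that do not match. A minimal sketch under stated assumptions: JobInfo, State and filter() are hypothetical names for illustration and are not part of the JobClient API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class JobFilterSketch {
  // Hypothetical job states; the real API defines its own.
  enum State { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

  // Hypothetical summary of one job as returned by a "get all jobs" call.
  static final class JobInfo {
    final String id;
    final State state;
    final boolean retired;

    JobInfo(String id, State state, boolean retired) {
      this.id = id;
      this.state = state;
      this.retired = retired;
    }
  }

  // Client-side filtering: every job has already been serialized by the
  // server and deserialized here before any of them is discarded.
  static List<JobInfo> filter(List<JobInfo> allJobs, Set<State> wanted,
      boolean includeRetired) {
    List<JobInfo> result = new ArrayList<>();
    for (JobInfo job : allJobs) {
      if (job.retired && !includeRetired) {
        continue;
      }
      if (wanted.contains(job.state)) {
        result.add(job);
      }
    }
    return result;
  }
}
```

Running the same predicate on the server side would return only the matching jobs, which is where the load reduction argued for above would come from.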
 

> Adding a refined version of JobTracker.getAllJobs() and exposing through the 
> JobClient
> --
>
> Key: MAPREDUCE-4346
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv1
>Reporter: Ahmed Radwan
>Assignee: Ahmed Radwan
> Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, 
> MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch
>
>
> The current implementation for JobTracker.getAllJobs() returns all submitted 
> jobs in any state, in addition to retired jobs. This list can be long and 
> represents an unneeded overhead especially in the case of clients only 
> interested in jobs in specific state(s). 
> It is beneficial to include a refined version where only jobs having specific 
> statuses are returned and retired jobs are optional to include. 
> I'll be uploading an initial patch momentarily.


[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

2012-06-27 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402516#comment-13402516
 ] 

Hudson commented on MAPREDUCE-4346:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #2415 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2415/])
Reverting MAPREDUCE-4346 r1353757 (Revision 1354656)

 Result = FAILURE
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1354656
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestJobClient.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestJobClientGetJob.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobClient.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/Cluster.java


> Adding a refined version of JobTracker.getAllJobs() and exposing through the 
> JobClient
> --
>
> Key: MAPREDUCE-4346
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv1
>Reporter: Ahmed Radwan
>Assignee: Ahmed Radwan
> Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, 
> MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch
>
>
> The current implementation for JobTracker.getAllJobs() returns all submitted 
> jobs in any state, in addition to retired jobs. This list can be long and 
> represents an unneeded overhead especially in the case of clients only 
> interested in jobs in specific state(s). 
> It is beneficial to include a refined version where only jobs having specific 
> statuses are returned and retired jobs are optional to include. 
> I'll be uploading an initial patch momentarily.


[jira] [Updated] (MAPREDUCE-4322) Fix command-line length abort issues on Windows

2012-06-27 Thread Ivan Mitic (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-4322:
--

Attachment: MAPREDUCE-4322-branch-1-win(5).patch

Attaching updated patch. Adding explicit checks that the correct exception 
string is returned. Also removing some of the "if WINDOWS" forks in the test 
code.

> Fix command-line length abort issues on Windows
> ---
>
> Key: MAPREDUCE-4322
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4322
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
> Environment: Windows, downstream applications with long aggregate 
> classpaths
>Reporter: John Gordon
>Assignee: Ivan Mitic
> Attachments: MAPREDUCE-4322-branch-1-win(2).patch, 
> MAPREDUCE-4322-branch-1-win(3).patch, MAPREDUCE-4322-branch-1-win(4).patch, 
> MAPREDUCE-4322-branch-1-win(5).patch, MAPREDUCE-4322-branch-1-win.patch
>
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> When a task is started on the tasktracker, it creates a small batch file to 
> invoke java and runs that batch.  Within the batch file, the invocation of 
> Java currently has -classpath ${CLASSPATH} inline in the command.  That line 
> often exceeds 8000 characters.  This is ok for most Linux distributions 
> because the line limit env variable is often set much higher than this.  
> However, on Windows this causes cmd to abort execution.  This surfaces in 
> Hadoop as an unknown failure mode for the task.
> I think the easiest and most natural way to fix this is to push the 
> -classpath option into a config file to take the longest variable part of the 
> line and put it somewhere that scales better.


[jira] [Commented] (MAPREDUCE-3837) Hadoop 22 Job tracker is not able to recover job in case of crash and after that no user can submit job.

2012-06-27 Thread Tom White (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402503#comment-13402503
 ] 

Tom White commented on MAPREDUCE-3837:
--

Mayank - thanks for the changes. Here's my feedback:

* If there is no need for restart count anymore - since jobs are re-run from 
the beginning each time - then would it be cleaner to remove it entirely?
* In JobTracker you changed "shouldRecover = false;" to "shouldRecover = true;" 
without updating the comment on the line before. (This might be related to the 
previous point about not having restart counts.)
* Remove the @Ignore annotation from TestRecoveryManager and the comment about 
MAPREDUCE-873.
* The new test testJobresubmission (should be testJobResubmission) should test 
that the job succeeded after the restart. Also, there's no reason to run it as 
a high-priority job.
* There's a comment saying it is a "faulty job" - which it isn't.
* Have setUp and tearDown methods to start and stop the cluster. At the moment 
there is code duplication, and clusters won't be shut down cleanly on failure.
* testJobTracker would be better named testJobTrackerRestartsWithMissingJobFile
* testRecoveryManager would be better named testJobTrackerRestartWithBadJobs
* There are multiple typos and formatting errors (including indentation, which 
should be 2 spaces) in the new code. See Konstantin's comment above.
* TestJobTrackerRestartWithLostTracker still fails, as does 
TestJobTrackerSafeMode. These should be fixed as a part of this work.


> Hadoop 22 Job tracker is not able to recover job in case of crash and after 
> that no user can submit job.
> 
>
> Key: MAPREDUCE-3837
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3837
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: Mayank Bansal
>Assignee: Mayank Bansal
> Fix For: 0.24.0, 0.22.1, 0.23.2
>
> Attachments: PATCH-HADOOP-1-MAPREDUCE-3837-1.patch, 
> PATCH-HADOOP-1-MAPREDUCE-3837-2.patch, PATCH-HADOOP-1-MAPREDUCE-3837-3.patch, 
> PATCH-HADOOP-1-MAPREDUCE-3837.patch, PATCH-MAPREDUCE-3837.patch, 
> PATCH-TRUNK-MAPREDUCE-3837.patch
>
>
> If the job tracker crashes while jobs are running, and the job tracker's 
> property mapreduce.jobtracker.restart.recover is true, then it should 
> recover the jobs.
> However, the current behavior is as follows:
> the jobtracker tries to restore the jobs but cannot. After that, the 
> jobtracker closes its handle to HDFS and nobody else can submit a job. 
> Thanks,
> Mayank


[jira] [Commented] (MAPREDUCE-4376) TestClusterMRNotification times out

2012-06-27 Thread Kihwal Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402499#comment-13402499
 ] 

Kihwal Lee commented on MAPREDUCE-4376:
---

There is a check for null to handle transitions from UNASSIGNED state, but the 
check doesn't work anymore because  assignedRequest.get() throws NPE after the 
following change from MAPREDUCE-3921.  

{noformat}
 ContainerId get(TaskAttemptId tId) {
   if (tId.getTaskId().getTaskType().equals(TaskType.MAP)) {
-    return maps.get(tId);
+    return maps.get(tId).getId();
   } else {
-    return reduces.get(tId);
+    return reduces.get(tId).getId();
   }
 }
{noformat}

Jason has also suggested we put a time limit in these jobs so that they don't 
hang even if something goes wrong.
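
One way to keep the caller's null check meaningful is to make the lookup itself null-safe. The sketch below is a simplified, self-contained model of that idea and not the committed fix: Container, the String keys and the isMap flag are stand-ins for the real types.

```java
import java.util.HashMap;
import java.util.Map;

final class AssignedRequestsSketch {
  // Stand-in for the container type; only getId() matters here.
  static final class Container {
    private final String id;

    Container(String id) {
      this.id = id;
    }

    String getId() {
      return id;
    }
  }

  final Map<String, Container> maps = new HashMap<>();
  final Map<String, Container> reduces = new HashMap<>();

  // Returns the container id, or null when the attempt was never assigned,
  // so a caller's null check keeps working for UNASSIGNED attempts.
  String get(String attemptId, boolean isMap) {
    Container c = isMap ? maps.get(attemptId) : reduces.get(attemptId);
    return c == null ? null : c.getId();
  }
}
```

With this shape, an attempt that has no container yields null instead of an NPE from calling getId() on a missing map entry.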

> TestClusterMRNotification times out
> ---
>
> Key: MAPREDUCE-4376
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4376
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, test
>Affects Versions: 2.0.1-alpha
>Reporter: Jason Lowe
>Assignee: Kihwal Lee
>
> The TestClusterMRNotification test is often timing out.  git bisect tests 
> narrowed it down to MAPREDUCE-3921, as the test consistently passes before 
> that change and times out most of the time after picking up that change.


[jira] [Commented] (MAPREDUCE-4322) Fix command-line length abort issues on Windows

2012-06-27 Thread Bikas Saha (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402489#comment-13402489
 ] 

Bikas Saha commented on MAPREDUCE-4322:
---

That's exactly what I am saying too :) The test is trying to cover both cases, 
but the result is kind of implicit right now because we know both paths are 
being covered. However, by checking only for sb.toString() in the test itself, 
we are not making that explicit. There is nothing to hardcode. Unless I am 
reading the test code incorrectly, we have already defined List setup 
and List cmd. In the exception message, along with checking for 
sb.toString(), we could also check for setup[0] and cmd[0]. That way it's 
explicit that 2 different paths are being covered.

> Fix command-line length abort issues on Windows
> ---
>
> Key: MAPREDUCE-4322
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4322
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
> Environment: Windows, downstream applications with long aggregate 
> classpaths
>Reporter: John Gordon
>Assignee: Ivan Mitic
> Attachments: MAPREDUCE-4322-branch-1-win(2).patch, 
> MAPREDUCE-4322-branch-1-win(3).patch, MAPREDUCE-4322-branch-1-win(4).patch, 
> MAPREDUCE-4322-branch-1-win.patch
>
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> When a task is started on the tasktracker, it creates a small batch file to 
> invoke java and runs that batch.  Within the batch file, the invocation of 
> Java currently has -classpath ${CLASSPATH} inline in the command.  That line 
> often exceeds 8000 characters.  This is ok for most Linux distributions 
> because the line limit env variable is often set much higher than this.  
> However, on Windows this causes cmd to abort execution.  This surfaces in 
> Hadoop as an unknown failure mode for the task.
> I think the easiest and most natural way to fix this is to push the 
> -classpath option into a config file to take the longest variable part of the 
> line and put it somewhere that scales better.


[jira] [Assigned] (MAPREDUCE-4376) TestClusterMRNotification times out

2012-06-27 Thread Kihwal Lee (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee reassigned MAPREDUCE-4376:
-

Assignee: Kihwal Lee

> TestClusterMRNotification times out
> ---
>
> Key: MAPREDUCE-4376
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4376
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, test
>Affects Versions: 2.0.1-alpha
>Reporter: Jason Lowe
>Assignee: Kihwal Lee
>
> The TestClusterMRNotification test is often timing out.  git bisect tests 
> narrowed it down to MAPREDUCE-3921, as the test consistently passes before 
> that change and times out most of the time after picking up that change.


[jira] [Updated] (MAPREDUCE-4360) Capacity Scheduler Hierarchical leaf queue does not honor the max capacity of container queue

2012-06-27 Thread Mayank Bansal (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-4360:
-

Attachment: MAPREDUCE-4360-22-1.patch

Thanks Konst for your comments.

Updated the patch to address the formatting issues.

Thanks,
Mayank

> Capacity Scheduler Hierarchical leaf queue does not honor the max capacity of 
> container queue
> -
>
> Key: MAPREDUCE-4360
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4360
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.22.1
>Reporter: Mayank Bansal
>Assignee: Mayank Bansal
> Attachments: MAPREDUCE-4360-22-1.patch, MAPREDUCE-4360-22.patch
>
>



[jira] [Updated] (MAPREDUCE-4360) Capacity Scheduler Hierarchical leaf queue does not honor the max capacity of container queue

2012-06-27 Thread Mayank Bansal (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-4360:
-

Affects Version/s: (was: trunk)

> Capacity Scheduler Hierarchical leaf queue does not honor the max capacity of 
> container queue
> -
>
> Key: MAPREDUCE-4360
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4360
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.22.1
>Reporter: Mayank Bansal
>Assignee: Mayank Bansal
> Attachments: MAPREDUCE-4360-22.patch
>
>



[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

2012-06-27 Thread Arun C Murthy (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402478#comment-13402478
 ] 

Arun C Murthy commented on MAPREDUCE-4346:
--

To be clear: we should refrain from adding public apis without a *compelling* 
use-case to MRv1, particularly when they are going to be hard to support in 
MRv2. Thanks.

> Adding a refined version of JobTracker.getAllJobs() and exposing through the 
> JobClient
> --
>
> Key: MAPREDUCE-4346
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv1
>Reporter: Ahmed Radwan
>Assignee: Ahmed Radwan
> Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, 
> MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch
>
>
> The current implementation for JobTracker.getAllJobs() returns all submitted 
> jobs in any state, in addition to retired jobs. This list can be long and 
> represents an unneeded overhead especially in the case of clients only 
> interested in jobs in specific state(s). 
> It is beneficial to include a refined version where only jobs having specific 
> statuses are returned and retired jobs are optional to include. 
> I'll be uploading an initial patch momentarily.


[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

2012-06-27 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402479#comment-13402479
 ] 

Hudson commented on MAPREDUCE-4346:
---

Integrated in Hadoop-Common-trunk-Commit #2396 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2396/])
Reverting MAPREDUCE-4346 r1353757 (Revision 1354656)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1354656
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestJobClient.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestJobClientGetJob.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobClient.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/Cluster.java


> Adding a refined version of JobTracker.getAllJobs() and exposing through the 
> JobClient
> --
>
> Key: MAPREDUCE-4346
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv1
>Reporter: Ahmed Radwan
>Assignee: Ahmed Radwan
> Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, 
> MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch
>
>
> The current implementation for JobTracker.getAllJobs() returns all submitted 
> jobs in any state, in addition to retired jobs. This list can be long and 
> represents an unneeded overhead especially in the case of clients only 
> interested in jobs in specific state(s). 
> It is beneficial to include a refined version where only jobs having specific 
> statuses are returned and retired jobs are optional to include. 
> I'll be uploading an initial patch momentarily.


[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

2012-06-27 Thread Arun C Murthy (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402477#comment-13402477
 ] 

Arun C Murthy commented on MAPREDUCE-4346:
--

bq. As I highlighted in the ticket description above: The JobClient only 
exposes a getAllJobs() which returns all submitted jobs in any state, the 
result also includes all retired jobs. This list is long and represents an 
unneeded overhead especially in the case of clients only interested in jobs in 
specific states.

Ahmed, I'm not convinced. Yes, it's a bit more overhead, but I don't see how 
adding a new public api is going to make a significant difference. In any 
case, if you set completed jobs to 0, you'll not get retired jobs. Unless I 
hear a more compelling argument I'm -1 on this. Also, please remember that 
this API is fairly hard to support with YARN, so that is another problem.

> Adding a refined version of JobTracker.getAllJobs() and exposing through the 
> JobClient
> --
>
> Key: MAPREDUCE-4346
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv1
>Reporter: Ahmed Radwan
>Assignee: Ahmed Radwan
> Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, 
> MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch
>
>
> The current implementation for JobTracker.getAllJobs() returns all submitted 
> jobs in any state, in addition to retired jobs. This list can be long and 
> represents an unneeded overhead especially in the case of clients only 
> interested in jobs in specific state(s). 
> It is beneficial to include a refined version where only jobs having specific 
> statuses are returned and retired jobs are optional to include. 
> I'll be uploading an initial patch momentarily.


[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

2012-06-27 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402469#comment-13402469
 ] 

Hudson commented on MAPREDUCE-4346:
---

Integrated in Hadoop-Hdfs-trunk-Commit #2465 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2465/])
Reverting MAPREDUCE-4346 r1353757 (Revision 1354656)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1354656
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestJobClient.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestJobClientGetJob.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobClient.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/Cluster.java


> Adding a refined version of JobTracker.getAllJobs() and exposing through the 
> JobClient
> --
>
> Key: MAPREDUCE-4346
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv1
>Reporter: Ahmed Radwan
>Assignee: Ahmed Radwan
> Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, 
> MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch
>
>
> The current implementation for JobTracker.getAllJobs() returns all submitted 
> jobs in any state, in addition to retired jobs. This list can be long and 
> represents an unneeded overhead especially in the case of clients only 
> interested in jobs in specific state(s). 
> It is beneficial to include a refined version where only jobs having specific 
> statuses are returned and retired jobs are optional to include. 
> I'll be uploading an initial patch momentarily.


[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

2012-06-27 Thread Alejandro Abdelnur (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402464#comment-13402464
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4346:
---

Ahmed, LGTM. The only thing is that status is an int and you are using a set to 
do the filtering, which means that an Integer will be boxed for each comparison. 
Instead, I'd just iterate over the received filter using a helper method 
*boolean filter(int filter[], int status)*.
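
A minimal sketch of the helper suggested above, assuming job states are plain int codes. The class name StatusFilter and the state values used in main are illustrative only and are not taken from the attached patches.

```java
/** Hypothetical helper for illustration; not part of the MAPREDUCE-4346 patch. */
final class StatusFilter {
  private StatusFilter() {}

  /**
   * Returns true if status equals any entry of filter.
   * A linear scan over an int[] compares primitives directly, so no
   * Integer has to be boxed as it would for Set<Integer>.contains(status).
   */
  static boolean filter(int[] filter, int status) {
    for (int accepted : filter) {
      if (accepted == status) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    int[] wanted = {1, 4};  // illustrative state codes only
    System.out.println(filter(wanted, 1));
    System.out.println(filter(wanted, 2));
  }
}
```

For the handful of states a client is likely to pass, the linear scan should cost no more than a hash lookup, and it keeps the comparison allocation-free.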


> Adding a refined version of JobTracker.getAllJobs() and exposing through the 
> JobClient
> --
>
> Key: MAPREDUCE-4346
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv1
>Reporter: Ahmed Radwan
>Assignee: Ahmed Radwan
> Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, 
> MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch
>
>
> The current implementation for JobTracker.getAllJobs() returns all submitted 
> jobs in any state, in addition to retired jobs. This list can be long and 
> represents an unneeded overhead especially in the case of clients only 
> interested in jobs in specific state(s). 
> It is beneficial to include a refined version where only jobs having specific 
> statuses are returned and retired jobs are optional to include. 
> I'll be uploading an initial patch momentarily.


[jira] [Commented] (MAPREDUCE-4355) Add startTime to RunningJob

2012-06-27 Thread Arun C Murthy (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402465#comment-13402465
 ] 

Arun C Murthy commented on MAPREDUCE-4355:
--

bq. Arun, it might be cleaner to add RunningJob.getJobStatus() instead of 
adding startTime, endTime fields to RunningJob and redundantly maintaining them.

+1, good point!
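
A rough sketch of the shape being agreed on here, using stand-in types rather than the real Hadoop classes. JobStatusSketch, RunningJobSketch and their accessors are hypothetical names chosen for illustration; the point is only that the running-job handle exposes one status object instead of duplicating startTime and endTime fields.

```java
/** Hypothetical stand-in for a job status record; not the Hadoop class. */
final class JobStatusSketch {
  private final long startTime;
  private final long endTime;

  JobStatusSketch(long startTime, long endTime) {
    this.startTime = startTime;
    this.endTime = endTime;
  }

  long getStartTime() { return startTime; }
  long getEndTime() { return endTime; }
}

/** Hypothetical stand-in for the running-job handle. */
interface RunningJobSketch {
  /** Single accessor; times are read from the status, not stored twice. */
  JobStatusSketch getJobStatus();
}

final class RunningJobDemo {
  /** Reads the start time without listing and scanning all jobs. */
  static long startTimeOf(RunningJobSketch job) {
    return job.getJobStatus().getStartTime();
  }

  public static void main(String[] args) {
    RunningJobSketch job = () -> new JobStatusSketch(100L, 250L);
    System.out.println(startTimeOf(job));
  }
}
```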

> Add startTime to RunningJob
> ---
>
> Key: MAPREDUCE-4355
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv1, mrv2
>Affects Versions: 1.0.3, 2.0.0-alpha
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Fix For: 1.1.0, 2.0.1-alpha
>
> Attachments: MR-4355_mr1.patch, MR-4355_mr2.patch
>
>
> To read the start-time of a particular job, one should not need to call 
> getAllJobs() and iterate through the results.
> getJob(JobID) returns RunningJob, which doesn't hold the job's start time.
> Hence, we need to either add getJobStatus(JobID) to the API or add startTime 
> to RunningJob. Doing the latter.


[jira] [Updated] (MAPREDUCE-4342) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2012-06-27 Thread Mayank Bansal (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-4342:
-

Attachment: MAPREDUCE-4342-22-3.patch

Hi Konst,

Thanks for the comments; I have addressed all of them in the updated patch.

Thanks,
Mayank

> Distributed Cache gives inconsistent result if cache files get deleted from 
> task tracker 
> -
>
> Key: MAPREDUCE-4342
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4342
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.22.0, 1.0.3, trunk
>Reporter: Mayank Bansal
>Assignee: Mayank Bansal
> Attachments: MAPREDUCE-4342-22-1.patch, MAPREDUCE-4342-22-2.patch, 
> MAPREDUCE-4342-22-3.patch, MAPREDUCE-4342-22.patch
>
>



[jira] [Comment Edited] (MAPREDUCE-4355) Add startTime to RunningJob

2012-06-27 Thread Alejandro Abdelnur (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402450#comment-13402450
 ] 

Alejandro Abdelnur edited comment on MAPREDUCE-4355 at 6/27/12 6:45 PM:


reverted from trunk, branch-2 and branch-1.

  was (Author: tucu00):
reverted from trunk and branch-2
  
> Add startTime to RunningJob
> ---
>
> Key: MAPREDUCE-4355
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv1, mrv2
>Affects Versions: 1.0.3, 2.0.0-alpha
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Fix For: 1.1.0, 2.0.1-alpha
>
> Attachments: MR-4355_mr1.patch, MR-4355_mr2.patch
>
>
> To read the start-time of a particular job, one should not need to call 
> getAllJobs() and iterate through the results.
> getJob(JobID) returns RunningJob, which doesn't hold the job's start time.
> Hence, we need to either add getJobStatus(JobID) to the API or add startTime 
> to RunningJob. Doing the latter.


[jira] [Commented] (MAPREDUCE-4355) Add startTime to RunningJob

2012-06-27 Thread Alejandro Abdelnur (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402450#comment-13402450
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4355:
---

reverted from trunk and branch-2

> Add startTime to RunningJob
> ---
>
> Key: MAPREDUCE-4355
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv1, mrv2
>Affects Versions: 1.0.3, 2.0.0-alpha
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Fix For: 1.1.0, 2.0.1-alpha
>
> Attachments: MR-4355_mr1.patch, MR-4355_mr2.patch
>
>
> To read the start-time of a particular job, one should not need to call 
> getAllJobs() and iterate through the results.
> getJob(JobID) returns RunningJob, which doesn't hold the job's start time.
> Hence, we need to either add getJobStatus(JobID) to the API or add startTime 
> to RunningJob. Doing the latter.


[jira] [Commented] (MAPREDUCE-4377) TaskRunner javaopts parsing doesn't handle embedded spaces

2012-06-27 Thread John Gordon (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402452#comment-13402452
 ] 

John Gordon commented on MAPREDUCE-4377:


Thanks Robert!  I agree it won't be an easy fix and may need some 
rearchitecture and significant test additions.

> TaskRunner javaopts parsing doesn't handle embedded spaces
> --
>
> Key: MAPREDUCE-4377
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4377
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task-controller
>Affects Versions: trunk
> Environment: java options containing escaped or non-escaped embedded 
> spaces.
>Reporter: John Gordon
>
> TaskRunner::GetVMArgs reads getChildJavaOpts as one space-delimited string, 
> then splits it on ' ' and tries to reason on individual options from there.  
> The problem with this approach is that java options may contain embedded 
> spaces in many legitimate cases -- this means it is reasoning on incomplete 
> option strings and cannot do appropriate preprocessing to do things like 
> handle escape characters or matched quotation marks.
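
To make the problem concrete, here is a small sketch of a tokenizer that keeps quoted or escaped spaces inside one option instead of splitting on every ' '. It is not the TaskRunner code. The class name JavaOptsTokenizer and the quoting rules are assumptions: single or double quotes group text, and a backslash makes the next character literal (which would need different handling for Windows-style paths).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical tokenizer for illustration; not the Hadoop implementation. */
final class JavaOptsTokenizer {
  private JavaOptsTokenizer() {}

  static List<String> tokenize(String opts) {
    List<String> tokens = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    boolean inToken = false;  // true once the current token has started
    char quote = 0;           // 0 means "not inside quotes"

    for (int i = 0; i < opts.length(); i++) {
      char c = opts.charAt(i);
      if (c == '\\' && i + 1 < opts.length()) {
        // Assumed rule: backslash makes the next character literal.
        current.append(opts.charAt(++i));
        inToken = true;
      } else if (quote != 0) {
        if (c == quote) {
          quote = 0;          // closing quote, token continues
        } else {
          current.append(c);  // spaces inside quotes are kept
        }
      } else if (c == '"' || c == '\'') {
        quote = c;
        inToken = true;
      } else if (Character.isWhitespace(c)) {
        if (inToken) {        // unquoted whitespace ends the token
          tokens.add(current.toString());
          current.setLength(0);
          inToken = false;
        }
      } else {
        current.append(c);
        inToken = true;
      }
    }
    if (quote != 0) {
      throw new IllegalArgumentException("Unmatched quote in: " + opts);
    }
    if (inToken) {
      tokens.add(current.toString());
    }
    return tokens;
  }

  public static void main(String[] args) {
    String opts = "-Xmx200m \"-Dpath=C:/Program Files/x\" -Dname=a\\ b";
    for (String token : tokenize(opts)) {
      System.out.println("[" + token + "]");
    }
  }
}
```

Splitting the same string on ' ' would break both the quoted option and the escaped one into fragments, which is the incomplete-option situation the description refers to.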


[jira] [Commented] (MAPREDUCE-4322) Fix command-line length abort issues on Windows

2012-06-27 Thread Ivan Mitic (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402444#comment-13402444
 ] 

Ivan Mitic commented on MAPREDUCE-4322:
---

3. Oh, thanks for clarifying. My thinking was that, from the user's perspective, 
we are outputting the actual command that exceeded the limit; whether it is the 
setup or the command is not as relevant. In unit tests, since I know the code, I 
want to cover all cases, so I'm testing both. I am leaning toward keeping the code 
as is, given that I wouldn't want to have a hardcoded dependency on what is in 
the exception message. Let me know if you feel strongly about this.

> Fix command-line length abort issues on Windows
> ---
>
> Key: MAPREDUCE-4322
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4322
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
> Environment: Windows, downstream applications with long aggregate 
> classpaths
>Reporter: John Gordon
>Assignee: Ivan Mitic
> Attachments: MAPREDUCE-4322-branch-1-win(2).patch, 
> MAPREDUCE-4322-branch-1-win(3).patch, MAPREDUCE-4322-branch-1-win(4).patch, 
> MAPREDUCE-4322-branch-1-win.patch
>
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> When a task is started on the tasktracker, it creates a small batch file to 
> invoke java and runs that batch.  Within the batch file, the invocation of 
> Java currently has -classpath ${CLASSPATH} inline to the command.  That line 
> often exceeds 8000 characters.  This is OK for most Linux distributions 
> because the line limit env variable is often set much higher than this.  
> However, on Windows this causes cmd to abort execution.  This surfaces in 
> Hadoop as an unknown failure mode for the task.
> I think the easiest and most natural way to fix this is to push the 
> -classpath option into a config file to take the longest variable part of the 
> line and put it somewhere that scales better.
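
The failure mode described above can be surfaced early by checking each generated line before the batch file is written. The following is a sketch only: CommandLengthCheck is a hypothetical class, and the 8191-character figure is an assumption about the cmd.exe limit rather than a value taken from the patch.

```java
import java.io.IOException;

/** Hypothetical pre-flight check; not the code from the attached patches. */
final class CommandLengthCheck {
  // Assumption: cmd.exe rejects command lines longer than 8191 characters.
  static final int WINDOWS_MAX_LINE_LENGTH = 8191;

  private CommandLengthCheck() {}

  /** Throws if line is longer than maxLength, echoing the offending line. */
  static void checkLength(String line, int maxLength) throws IOException {
    if (line.length() > maxLength) {
      throw new IOException("Command line length " + line.length()
          + " exceeds limit " + maxLength + ": " + line);
    }
  }

  public static void main(String[] args) {
    // Build an artificially long -classpath argument.
    StringBuilder classpath = new StringBuilder();
    for (int i = 0; i < 2000; i++) {
      classpath.append("lib").append(i).append(".jar;");
    }
    String line = "java -classpath " + classpath + " Example";
    try {
      checkLength(line, WINDOWS_MAX_LINE_LENGTH);
      System.out.println("line fits");
    } catch (IOException e) {
      System.out.println("rejected line of length " + line.length());
    }
  }
}
```

Including the offending line in the exception message gives the task a specific failure reason instead of the unknown failure mode mentioned in the description.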


[jira] [Commented] (MAPREDUCE-4360) Capacity Scheduler Hierarchical leaf queue does not honur the max capacity of container queue

2012-06-27 Thread Mayank Bansal (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402426#comment-13402426
 ] 

Mayank Bansal commented on MAPREDUCE-4360:
--

Jason,

I did not realize that it is already fixed in trunk; I will update the JIRA. 
Thanks for pointing this out.


Konst,

That's already done when tasks are assigned to any queue.

Thanks,
Mayank

> Capacity Scheduler Hierarchical leaf queue does not honur the max capacity of 
> container queue
> -
>
> Key: MAPREDUCE-4360
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4360
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.22.1, trunk
>Reporter: Mayank Bansal
>Assignee: Mayank Bansal
> Attachments: MAPREDUCE-4360-22.patch
>
>



[jira] [Commented] (MAPREDUCE-4322) Fix command-line length abort issues on Windows

2012-06-27 Thread Bikas Saha (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402419#comment-13402419
 ] 

Bikas Saha commented on MAPREDUCE-4322:
---

3. My main concern is that we are not differentiating that the first failure is 
due to a bad setup string while the second one is due to a bad cmd string. 
Since the code is adding the exact failed command into the exception, we could 
look for "setup" in the first case and "command" in the second case in addition 
to sb.toString(). I should have been clearer. I didn't literally mean 
"setup.toString()" because it's a list :)

> Fix command-line length abort issues on Windows
> ---
>
> Key: MAPREDUCE-4322
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4322
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
> Environment: Windows, downstream applications with long aggregate 
> classpaths
>Reporter: John Gordon
>Assignee: Ivan Mitic
> Attachments: MAPREDUCE-4322-branch-1-win(2).patch, 
> MAPREDUCE-4322-branch-1-win(3).patch, MAPREDUCE-4322-branch-1-win(4).patch, 
> MAPREDUCE-4322-branch-1-win.patch
>
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> When a task is started on the tasktracker, it creates a small batch file to 
> invoke java and runs that batch.  Within the batch file, the invocation of 
> Java currently has -classpath ${CLASSPATH} inline to the command.  That line 
> often exceeds 8000 characters.  This is OK for most Linux distributions 
> because the line limit env variable is often set much higher than this.  
> However, on Windows this causes cmd to abort execution.  This surfaces in 
> Hadoop as an unknown failure mode for the task.
> I think the easiest and most natural way to fix this is to push the 
> -classpath option into a config file to take the longest variable part of the 
> line and put it somewhere that scales better.


[jira] [Commented] (MAPREDUCE-4376) TestClusterMRNotification times out

2012-06-27 Thread Kihwal Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402359#comment-13402359
 ] 

Kihwal Lee commented on MAPREDUCE-4376:
---

Relevant log entries:

{noformat}
2012-06-27 08:48:55,331 INFO [IPC Server handler 0 on 57856] 
org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Kill Job received 
from client job_1340812108963_0002
2012-06-27 08:48:55,332 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1340812108963_0002Job 
Transitioned from RUNNING to KILL_WAIT
2012-06-27 08:48:55,332 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: 
task_1340812108963_0002_m_00 Task Transitioned from SCHEDULED to KILL_WAIT
2012-06-27 08:48:55,332 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: 
task_1340812108963_0002_m_01 Task Transitioned from SCHEDULED to KILL_WAIT
2012-06-27 08:48:55,333 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: 
task_1340812108963_0002_r_00 Task Transitioned from SCHEDULED to KILL_WAIT
2012-06-27 08:48:55,334 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: 
attempt_1340812108963_0002_m_00_0 TaskAttempt Transitioned from UNASSIGNED 
to KILLED
2012-06-27 08:48:55,334 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: 
attempt_1340812108963_0002_m_01_0 TaskAttempt Transitioned from UNASSIGNED 
to KILLED
2012-06-27 08:48:55,335 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: 
attempt_1340812108963_0002_r_00_0 TaskAttempt Transitioned from UNASSIGNED 
to KILLED
2012-06-27 08:48:55,335 INFO [Thread-45] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Processing the 
event EventType: CONTAINER_DEALLOCATE
2012-06-27 08:48:55,338 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: 
task_1340812108963_0002_m_00 Task Transitioned from KILL_WAIT to KILLED
2012-06-27 08:48:55,338 INFO [Thread-45] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Processing the 
event EventType: CONTAINER_DEALLOCATE
2012-06-27 08:48:55,338 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: 
task_1340812108963_0002_m_01 Task Transitioned from KILL_WAIT to KILLED
2012-06-27 08:48:55,338 INFO [Thread-45] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Processing the 
event EventType: CONTAINER_DEALLOCATE
2012-06-27 08:48:55,339 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: 
task_1340812108963_0002_r_00 Task Transitioned from KILL_WAIT to KILLED
2012-06-27 08:48:55,339 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 1
2012-06-27 08:48:55,339 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 2
2012-06-27 08:48:55,340 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 3
2012-06-27 08:48:55,341 ERROR [Thread-45] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Error in handling 
event type CONTAINER_DEALLOCATE to the ContainreAllocator
java.lang.NullPointerException
at 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$AssignedRequests.get(RMContainerAllocator.java:1103)
at 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.handleEvent(RMContainerAllocator.java:339)
at 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$1.run(RMContainerAllocator.java:191)
2012-06-27 08:48:55,348 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1340812108963_0002Job 
Transitioned from KILL_WAIT to KILLED
2012-06-27 08:48:55,348 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1340812108963_0002Job 
Transitioned from KILLED to ERROR
{noformat}

The code assumes that if the attempt ID is not found in scheduledRequests, it 
will be in assignedRequests. But in this case, it was still in UNASSIGNED.
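
A simplified sketch of the missing case, using hypothetical stand-in collections rather than the real RMContainerAllocator internals. The names DeallocateSketch, schedule, assign and deallocate are illustrative; the idea is only that an attempt found in neither collection is treated as "nothing to release" instead of being dereferenced.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for the allocator's bookkeeping; not the Hadoop code. */
final class DeallocateSketch {
  private final Set<String> scheduledRequests = new HashSet<>();
  private final Map<String, String> assignedRequests = new HashMap<>();

  void schedule(String attemptId) {
    scheduledRequests.add(attemptId);
  }

  void assign(String attemptId, String containerId) {
    scheduledRequests.remove(attemptId);
    assignedRequests.put(attemptId, containerId);
  }

  /**
   * Returns the container ID to release, or null when the attempt was only
   * scheduled or was never known here (for example, still UNASSIGNED).
   */
  String deallocate(String attemptId) {
    if (scheduledRequests.remove(attemptId)) {
      return null;  // request withdrawn before a container was assigned
    }
    // May be null: do not assume the attempt must be in assignedRequests.
    return assignedRequests.remove(attemptId);
  }

  public static void main(String[] args) {
    DeallocateSketch allocator = new DeallocateSketch();
    String containerId = allocator.deallocate("attempt_never_scheduled");
    if (containerId != null) {
      System.out.println("releasing " + containerId);
    } else {
      System.out.println("nothing to release");
    }
  }
}
```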

> TestClusterMRNotification times out
> ---
>
> Key: MAPREDUCE-4376
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4376
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, test
>Affects Versions: 2.0.1-alpha
>Reporter: Jason Lowe
>
> The TestClusterMRNotification test is often timing out.  git bisect tests 
> narrowed it down to MAPREDUCE-3921, as the test consistently passes before 
> that change and times out most of the time after picking up that change.

[jira] [Commented] (MAPREDUCE-4371) Check for cyclic dependencies in Jobcontrol job DAG

2012-06-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402328#comment-13402328
 ] 

Hadoop QA commented on MAPREDUCE-4371:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12533666/MAPREDUCE-4371-1.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 javadoc.  The javadoc tool appears to have generated 2 warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2522//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2522//console

This message is automatically generated.

> Check for cyclic dependencies in Jobcontrol job DAG
> ---
>
> Key: MAPREDUCE-4371
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4371
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv1
>Affects Versions: 3.0.0
>Reporter: madhukara phatak
> Attachments: MAPREDUCE-4371-1.patch, MAPREDUCE-4371.patch
>
>
> In the current implementation of JobControl, whenever there is a cyclic 
> dependency between the jobs it throws a StackOverflowError. This jira 
> adds a cyclic check to jobcontrol.
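
One way such a check could look, sketched with plain job names instead of JobControl's own job objects. This is not the attached patch: JobDagCheck and hasCycle are hypothetical names, and the technique shown is Kahn's topological sort, chosen because it is iterative and so cannot itself overflow the stack.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical cycle check for illustration; not the MAPREDUCE-4371 patch. */
final class JobDagCheck {
  private JobDagCheck() {}

  /** dependsOn maps each job name to the names of the jobs it depends on. */
  static boolean hasCycle(Map<String, List<String>> dependsOn) {
    Map<String, Integer> unmet = new HashMap<>();             // unmet dependencies per job
    Map<String, List<String>> dependents = new HashMap<>();   // reverse edges

    for (Map.Entry<String, List<String>> entry : dependsOn.entrySet()) {
      String job = entry.getKey();
      unmet.putIfAbsent(job, 0);
      for (String dependency : entry.getValue()) {
        unmet.putIfAbsent(dependency, 0);
        unmet.merge(job, 1, Integer::sum);
        dependents.computeIfAbsent(dependency, k -> new ArrayList<>()).add(job);
      }
    }

    // Start with jobs that depend on nothing.
    Deque<String> ready = new ArrayDeque<>();
    for (Map.Entry<String, Integer> entry : unmet.entrySet()) {
      if (entry.getValue() == 0) {
        ready.add(entry.getKey());
      }
    }

    int processed = 0;
    while (!ready.isEmpty()) {
      String job = ready.poll();
      processed++;
      for (String dependent : dependents.getOrDefault(job, Collections.emptyList())) {
        if (unmet.merge(dependent, -1, Integer::sum) == 0) {
          ready.add(dependent);
        }
      }
    }
    // Jobs on a cycle never reach zero unmet dependencies.
    return processed != unmet.size();
  }

  public static void main(String[] args) {
    Map<String, List<String>> jobs = new HashMap<>();
    jobs.put("a", Collections.singletonList("b"));
    jobs.put("b", Collections.singletonList("a"));
    System.out.println(hasCycle(jobs));
  }
}
```

Running a check like this when jobs are added would let the caller reject a cyclic graph with a clear error instead of recursing until the stack is exhausted.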


[jira] [Commented] (MAPREDUCE-4326) Resurrect RM Restart

2012-06-27 Thread Tsuyoshi OZAWA (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402323#comment-13402323
 ] 

Tsuyoshi OZAWA commented on MAPREDUCE-4326:
---

Bikas,

What's the current status? I can help if you are having any difficulty with the 
preliminary design sketch.

> Resurrect RM Restart 
> -
>
> Key: MAPREDUCE-4326
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4326
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, resourcemanager
>Affects Versions: 2.0.0-alpha
>Reporter: Arun C Murthy
>Assignee: Bikas Saha
> Attachments: MR-4343.1.patch
>
>
> We should resurrect 'RM Restart' which we disabled sometime during the RM 
> refactor.


[jira] [Commented] (MAPREDUCE-4376) TestClusterMRNotification times out

2012-06-27 Thread Kihwal Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402319#comment-13402319
 ] 

Kihwal Lee commented on MAPREDUCE-4376:
---

It used to be

job 1, SUCCEEDED, SUCCEEDED
job 2, KILLED, KILLED
job 3, FAILED, FAILED

Now it's getting

job 1, SUCCEEDED, SUCCEEDED
job 2, ERROR, ERROR

The test hangs after job 2. 


> TestClusterMRNotification times out
> ---
>
> Key: MAPREDUCE-4376
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4376
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, test
>Affects Versions: 2.0.1-alpha
>Reporter: Jason Lowe
>
> The TestClusterMRNotification test is often timing out.  git bisect tests 
> narrowed it down to MAPREDUCE-3921, as the test consistently passes before 
> that change and times out most of the time after picking up that change.


[jira] [Updated] (MAPREDUCE-4371) Check for cyclic dependencies in Jobcontrol job DAG

2012-06-27 Thread madhukara phatak (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

madhukara phatak updated MAPREDUCE-4371:


Attachment: MAPREDUCE-4371-1.patch

Updated the patch to fix test case and style issues.

> Check for cyclic dependencies in Jobcontrol job DAG
> ---
>
> Key: MAPREDUCE-4371
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4371
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv1
>Affects Versions: 3.0.0
>Reporter: madhukara phatak
> Attachments: MAPREDUCE-4371-1.patch, MAPREDUCE-4371.patch
>
>
> In the current implementation of JobControl, whenever there is a cyclic 
> dependency between the jobs it throws a StackOverflowError. This jira 
> adds a cyclic check to jobcontrol.


[jira] [Commented] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager

2012-06-27 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402304#comment-13402304
 ] 

Hudson commented on MAPREDUCE-4372:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #2412 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2412/])
MAPREDUCE-4372. Deadlock in Resource Manager (Devaraj K via bobby) 
(Revision 1354531)

 Result = FAILURE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1354531
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java


> Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor 
> and Shutdown hook manager
> --
>
> Key: MAPREDUCE-4372
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4372
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, resourcemanager
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Devaraj K
>Assignee: Devaraj K
> Fix For: 2.0.1-alpha, 3.0.0
>
> Attachments: MAPREDUCE-4372-1.patch, MAPREDUCE-4372.patch, 
> rm-threaddump.out
>
>
> Please find the attached resource manager thread dump for the issue.


[jira] [Commented] (MAPREDUCE-4326) Resurrect RM Restart

2012-06-27 Thread Tsuyoshi OZAWA (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402297#comment-13402297
 ] 

Tsuyoshi OZAWA commented on MAPREDUCE-4326:
---

Sharad, 

MAPREDUCE-2713 is now marked as a duplicate of this ticket (MAPREDUCE-4326).

> Resurrect RM Restart 
> -
>
> Key: MAPREDUCE-4326
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4326
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, resourcemanager
>Affects Versions: 2.0.0-alpha
>Reporter: Arun C Murthy
>Assignee: Bikas Saha
> Attachments: MR-4343.1.patch
>
>
> We should resurrect 'RM Restart' which we disabled sometime during the RM 
> refactor.


[jira] [Commented] (MAPREDUCE-4371) Check for cyclic dependencies in Jobcontrol job DAG

2012-06-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402282#comment-13402282
 ] 

Hadoop QA commented on MAPREDUCE-4371:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12533455/MAPREDUCE-4371.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 javadoc.  The javadoc tool appears to have generated 2 warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2521//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2521//console

This message is automatically generated.

> Check for cyclic dependencies in Jobcontrol job DAG
> ---
>
> Key: MAPREDUCE-4371
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4371
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv1
>Affects Versions: 3.0.0
>Reporter: madhukara phatak
> Attachments: MAPREDUCE-4371.patch
>
>
> In current implementation of JobControl, whenever there is a cyclic 
> dependency between the jobs it throws a Stack overflow exception. This jira 
> adds a cyclic check to jobcontrol.
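
The cyclic check described above can be illustrated with a depth-first search 
that tracks which jobs are on the current path. This is only a minimal sketch, 
not the code in MAPREDUCE-4371.patch: the class name JobDagCycleCheck, the 
method hasCycle, and the representation of the DAG as a map from job name to 
the names of the jobs it depends on are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration; not part of JobControl or the patch.
final class JobDagCycleCheck {

  // dependsOn maps a job name to the names of the jobs it depends on.
  static boolean hasCycle(Map<String, List<String>> dependsOn) {
    Set<String> done = new HashSet<String>();        // fully explored jobs
    Set<String> inProgress = new HashSet<String>();  // jobs on the current path
    for (String job : dependsOn.keySet()) {
      if (visit(job, dependsOn, done, inProgress)) {
        return true;
      }
    }
    return false;
  }

  private static boolean visit(String job, Map<String, List<String>> dependsOn,
      Set<String> done, Set<String> inProgress) {
    if (done.contains(job)) {
      return false;  // already explored, no cycle through this job
    }
    if (!inProgress.add(job)) {
      return true;   // job is already on the current path: cycle found
    }
    List<String> deps = dependsOn.get(job);
    if (deps != null) {
      for (String dep : deps) {
        if (visit(dep, dependsOn, done, inProgress)) {
          return true;
        }
      }
    }
    inProgress.remove(job);
    done.add(job);
    return false;
  }
}
```

Because a job that is already on the current path ends the search immediately, 
the recursion depth is bounded by the number of jobs, so a cyclic dependency 
is reported instead of recursing until the stack overflows.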


[jira] [Commented] (MAPREDUCE-4360) Capacity Scheduler Hierarchical leaf queue does not honor the max capacity of container queue

2012-06-27 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402271#comment-13402271
 ] 

Jason Lowe commented on MAPREDUCE-4360:
---

This JIRA indicates that trunk is affected, but I believe this has already been 
addressed in trunk (and branch-2 and branch-0.23) by MAPREDUCE-3683.

> Capacity Scheduler Hierarchical leaf queue does not honor the max capacity of 
> container queue
> -
>
> Key: MAPREDUCE-4360
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4360
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.22.1, trunk
>Reporter: Mayank Bansal
>Assignee: Mayank Bansal
> Attachments: MAPREDUCE-4360-22.patch
>
>



[jira] [Commented] (MAPREDUCE-4371) Check for cyclic dependencies in Jobcontrol job DAG

2012-06-27 Thread Robert Joseph Evans (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402268#comment-13402268
 ] 

Robert Joseph Evans commented on MAPREDUCE-4371:


Just a few comments about the patch.

# The new file needs an Apache license comment at the top.
# It would be nice to have a comment in the test about what the test class is 
intended to cover.
# The test looks like it is passing, but without any exceptions ever being 
caught in the test.  The run method catches all exceptions and then kills all 
of the jobs.  This is because run is intended to potentially be called on its 
own thread.  Please instead verify that all of the jobs are marked as failed at 
the end.
# Inside the patch itself it looks like there are a few places where the 
formatting is off. We use 2 spaces for indentation and try to keep lines 
under 80 characters.


Other than that it looks good.  Also, a bit of process: when you upload a 
patch, please check the box indicating that it is intended for inclusion in 
Apache, and then hit the submit patch button.  This will trigger Jenkins to 
test the patch against trunk.  I am going to hit submit patch for you, but 
you have to check the box yourself because it is your code and your 
copyright.

> Check for cyclic dependencies in Jobcontrol job DAG
> ---
>
> Key: MAPREDUCE-4371
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4371
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv1
>Affects Versions: 3.0.0
>Reporter: madhukara phatak
> Attachments: MAPREDUCE-4371.patch
>
>
> In current implementation of JobControl, whenever there is a cyclic 
> dependency between the jobs it throws a Stack overflow exception. This jira 
> adds a cyclic check to jobcontrol.


[jira] [Updated] (MAPREDUCE-4371) Check for cyclic dependencies in Jobcontrol job DAG

2012-06-27 Thread Robert Joseph Evans (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-4371:
---

Target Version/s: 3.0.0
  Status: Patch Available  (was: Open)

> Check for cyclic dependencies in Jobcontrol job DAG
> ---
>
> Key: MAPREDUCE-4371
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4371
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv1
>Affects Versions: 3.0.0
>Reporter: madhukara phatak
> Attachments: MAPREDUCE-4371.patch
>
>
> In current implementation of JobControl, whenever there is a cyclic 
> dependency between the jobs it throws a Stack overflow exception. This jira 
> adds a cyclic check to jobcontrol.


[jira] [Commented] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager

2012-06-27 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402264#comment-13402264
 ] 

Hudson commented on MAPREDUCE-4372:
---

Integrated in Hadoop-Common-trunk-Commit #2393 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2393/])
MAPREDUCE-4372. Deadlock in Resource Manager (Devaraj K via bobby) 
(Revision 1354531)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1354531
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java


> Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor 
> and Shutdown hook manager
> --
>
> Key: MAPREDUCE-4372
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4372
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, resourcemanager
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Devaraj K
>Assignee: Devaraj K
> Fix For: 2.0.1-alpha, 3.0.0
>
> Attachments: MAPREDUCE-4372-1.patch, MAPREDUCE-4372.patch, 
> rm-threaddump.out
>
>
> Please find the attached resource manager thread dump for the issue.


[jira] [Commented] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager

2012-06-27 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402262#comment-13402262
 ] 

Hudson commented on MAPREDUCE-4372:
---

Integrated in Hadoop-Hdfs-trunk-Commit #2462 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2462/])
MAPREDUCE-4372. Deadlock in Resource Manager (Devaraj K via bobby) 
(Revision 1354531)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1354531
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java


> Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor 
> and Shutdown hook manager
> --
>
> Key: MAPREDUCE-4372
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4372
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, resourcemanager
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Devaraj K
>Assignee: Devaraj K
> Fix For: 2.0.1-alpha, 3.0.0
>
> Attachments: MAPREDUCE-4372-1.patch, MAPREDUCE-4372.patch, 
> rm-threaddump.out
>
>
> Please find the attached resource manager thread dump for the issue.


[jira] [Updated] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager

2012-06-27 Thread Robert Joseph Evans (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-4372:
---

   Resolution: Fixed
Fix Version/s: 3.0.0
   2.0.1-alpha
   Status: Resolved  (was: Patch Available)

Thanks Devaraj,

I put this into trunk and branch-2.

> Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor 
> and Shutdown hook manager
> --
>
> Key: MAPREDUCE-4372
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4372
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, resourcemanager
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Devaraj K
>Assignee: Devaraj K
> Fix For: 2.0.1-alpha, 3.0.0
>
> Attachments: MAPREDUCE-4372-1.patch, MAPREDUCE-4372.patch, 
> rm-threaddump.out
>
>
> Please find the attached resource manager thread dump for the issue.


[jira] [Commented] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager

2012-06-27 Thread Robert Joseph Evans (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402255#comment-13402255
 ] 

Robert Joseph Evans commented on MAPREDUCE-4372:


Changes look good to me +1.

> Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor 
> and Shutdown hook manager
> --
>
> Key: MAPREDUCE-4372
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4372
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, resourcemanager
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Devaraj K
>Assignee: Devaraj K
> Attachments: MAPREDUCE-4372-1.patch, MAPREDUCE-4372.patch, 
> rm-threaddump.out
>
>
> Please find the attached resource manager thread dump for the issue.


[jira] [Commented] (MAPREDUCE-4228) mapreduce.job.reduce.slowstart.completedmaps is not working properly to delay the scheduling of the reduce tasks

2012-06-27 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402252#comment-13402252
 ] 

Hudson commented on MAPREDUCE-4228:
---

Integrated in Hadoop-Mapreduce-trunk #1122 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1122/])
MAPREDUCE-4228. mapreduce.job.reduce.slowstart.completedmaps is not working 
properly (Jason Lowe via bobby) (Revision 1354181)

 Result = FAILURE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1354181
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRMContainerAllocator.java


> mapreduce.job.reduce.slowstart.completedmaps is not working properly to delay 
> the scheduling of the reduce tasks
> 
>
> Key: MAPREDUCE-4228
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4228
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, mrv2
>Affects Versions: 0.23.1
>Reporter: Jason Lowe
>Assignee: Jason Lowe
> Fix For: 0.23.3, 2.0.1-alpha, 3.0.0
>
> Attachments: MAPREDUCE-4228.patch, MAPREDUCE-4228.patch, 
> MAPREDUCE-4228.patch
>
>
> If no more map tasks need to be scheduled but not all have completed, the 
> ApplicationMaster will start scheduling reducers even if the number of 
> completed maps has not met the mapreduce.job.reduce.slowstart.completedmaps 
> threshold.  For example, if the property is set to 1.0 all maps should 
> complete before any reducers are scheduled.  However the reducers are 
> scheduled as soon as the last map task is assigned to a container.  For a job 
> with very long-running maps, a cluster with enough capacity to launch all map 
> tasks could cause reducers to launch prematurely and waste cluster resources.
> Thanks to Phil Su for discovering this issue.
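
The intended behaviour of the threshold described above can be sketched as 
follows. This is an illustrative assumption, not the logic in 
RMContainerAllocator: the class SlowstartCheck and the method 
reducersMayStart are hypothetical names. The point is that the comparison 
uses the number of maps that have completed, not the number that have been 
assigned to containers.

```java
// Hypothetical helper for illustration; the real allocator logic is more
// involved and is not reproduced here.
final class SlowstartCheck {

  // slowstart is the value of mapreduce.job.reduce.slowstart.completedmaps,
  // a fraction between 0.0 and 1.0.
  static boolean reducersMayStart(int completedMaps, int totalMaps,
      float slowstart) {
    if (totalMaps == 0) {
      return true;  // nothing to wait for
    }
    // Number of maps that must have completed before reducers are scheduled.
    int required = (int) Math.ceil(slowstart * totalMaps);
    return completedMaps >= required;
  }
}
```

With the property set to 1.0 and 100 maps, 99 completed maps do not satisfy 
the check even if all 100 maps are already running in containers.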


[jira] [Updated] (MAPREDUCE-4379) Node Manager throws java.lang.OutOfMemoryError: Java heap space due to org.apache.hadoop.fs.LocalDirAllocator.contexts

2012-06-27 Thread Robert Joseph Evans (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-4379:
---

 Target Version/s: 0.23.3
Affects Version/s: 0.23.3

I really would like to see this go into 0.23 as well.

> Node Manager throws java.lang.OutOfMemoryError: Java heap space due to 
> org.apache.hadoop.fs.LocalDirAllocator.contexts
> --
>
> Key: MAPREDUCE-4379
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4379
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, nodemanager
>Affects Versions: 0.23.3, 2.0.0-alpha, 3.0.0
>Reporter: Devaraj K
>Assignee: Devaraj K
>Priority: Critical
>
> {code:xml}
> Exception in thread "Container Monitor" java.lang.OutOfMemoryError: Java heap 
> space
>   at java.io.BufferedReader.<init>(BufferedReader.java:80)
>   at java.io.BufferedReader.<init>(BufferedReader.java:91)
>   at 
> org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:410)
>   at 
> org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:171)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl$MonitoringThread.run(ContainersMonitorImpl.java:389)
>   Exception in thread "LocalizerRunner for 
> container_1340690914008_10890_01_03" java.lang.OutOfMemoryError: Java 
> heap space
>   at java.util.Arrays.copyOfRange(Arrays.java:3209)
>   at java.lang.String.<init>(String.java:215)
>   at 
> com.sun.org.apache.xerces.internal.xni.XMLString.toString(XMLString.java:185)
>   at 
> com.sun.org.apache.xerces.internal.parsers.AbstractDOMParser.characters(AbstractDOMParser.java:1188)
>   at 
> com.sun.org.apache.xerces.internal.xinclude.XIncludeHandler.characters(XIncludeHandler.java:1084)
>   at 
> com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:464)
>   at 
> com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
>   at 
> com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
>   at 
> com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
>   at 
> com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:235)
>   at 
> com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284)
>   at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:180)
>   at 
> org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1738)
>   at 
> org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1689)
>   at 
> org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1635)
>   at org.apache.hadoop.conf.Configuration.set(Configuration.java:722)
>   at 
> org.apache.hadoop.conf.Configuration.setStrings(Configuration.java:1300)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.initDirs(ContainerLocalizer.java:375)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:127)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:103)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:862)
> {code}


[jira] [Commented] (MAPREDUCE-4379) Node Manager throws java.lang.OutOfMemoryError: Java heap space due to org.apache.hadoop.fs.LocalDirAllocator.contexts

2012-06-27 Thread Devaraj K (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402244#comment-13402244
 ] 

Devaraj K commented on MAPREDUCE-4379:
--

{code:title=ContainerLocalizer.java|borderStyle=solid}
this.appDirs =
  new LocalDirAllocator(String.format(APPCACHE_CTXT_FMT, appId));
this.userDirs =
  new LocalDirAllocator(String.format(USERCACHE_CTXT_FMT, appId));
this.pendingResources = new HashMap<LocalResource,Future<Path>>();
{code}

Here for every application during localization, it creates two 
LocalDirAllocator instances.


{code:title=LocalDirAllocator.java|borderStyle=solid}
  private AllocatorPerContext obtainContext(String contextCfgItemName) {
    synchronized (contexts) {
      AllocatorPerContext l = contexts.get(contextCfgItemName);
      if (l == null) {
        contexts.put(contextCfgItemName, 
            (l = new AllocatorPerContext(contextCfgItemName)));
      }
      return l;
    }
  }
{code}

Those two instances internally create AllocatorPerContext instances and add 
them to the contexts map when obtaining a context. An entry is added for 
every application, and nothing ever removes these entries from the map. This 
leads to an OOM after the NodeManager has been running for some time.
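
The growth pattern described above can be reproduced with a simplified 
stand-in. This is a sketch under stated assumptions, not Hadoop's 
LocalDirAllocator: the class ContextRegistry, the use of plain Object values, 
and the removeContext method are hypothetical, and removeContext is shown 
only as one possible remedy.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for a static, per-name context cache.
final class ContextRegistry {
  private static final Map<String, Object> contexts =
      new HashMap<String, Object>();

  // Same shape as obtainContext above: create on first use, never evict.
  static Object obtainContext(String name) {
    synchronized (contexts) {
      Object ctx = contexts.get(name);
      if (ctx == null) {
        ctx = new Object();
        contexts.put(name, ctx);
      }
      return ctx;
    }
  }

  // Hypothetical cleanup hook: without something like this, one entry per
  // application-specific name stays in the map for the life of the process.
  static void removeContext(String name) {
    synchronized (contexts) {
      contexts.remove(name);
    }
  }

  static int size() {
    synchronized (contexts) {
      return contexts.size();
    }
  }
}
```

Calling obtainContext with a name that embeds the application id adds a new 
entry for each application, so the map grows without bound unless the entry 
is removed when the application finishes.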

> Node Manager throws java.lang.OutOfMemoryError: Java heap space due to 
> org.apache.hadoop.fs.LocalDirAllocator.contexts
> --
>
> Key: MAPREDUCE-4379
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4379
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, nodemanager
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Devaraj K
>Assignee: Devaraj K
>Priority: Critical
>
> {code:xml}
> Exception in thread "Container Monitor" java.lang.OutOfMemoryError: Java heap 
> space
>   at java.io.BufferedReader.<init>(BufferedReader.java:80)
>   at java.io.BufferedReader.<init>(BufferedReader.java:91)
>   at 
> org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:410)
>   at 
> org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:171)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl$MonitoringThread.run(ContainersMonitorImpl.java:389)
>   Exception in thread "LocalizerRunner for 
> container_1340690914008_10890_01_03" java.lang.OutOfMemoryError: Java 
> heap space
>   at java.util.Arrays.copyOfRange(Arrays.java:3209)
>   at java.lang.String.<init>(String.java:215)
>   at 
> com.sun.org.apache.xerces.internal.xni.XMLString.toString(XMLString.java:185)
>   at 
> com.sun.org.apache.xerces.internal.parsers.AbstractDOMParser.characters(AbstractDOMParser.java:1188)
>   at 
> com.sun.org.apache.xerces.internal.xinclude.XIncludeHandler.characters(XIncludeHandler.java:1084)
>   at 
> com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:464)
>   at 
> com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
>   at 
> com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
>   at 
> com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
>   at 
> com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:235)
>   at 
> com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284)
>   at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:180)
>   at 
> org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1738)
>   at 
> org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1689)
>   at 
> org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1635)
>   at org.apache.hadoop.conf.Configuration.set(Configuration.java:722)
>   at 
> org.apache.hadoop.conf.Configuration.setStrings(Configuration.java:1300)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.initDirs(ContainerLocalizer.java:375)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:127)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:103)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:862)
> {code}


[jira] [Created] (MAPREDUCE-4379) Node Manager throws java.lang.OutOfMemoryError: Java heap space due to org.apache.hadoop.fs.LocalDirAllocator.contexts

2012-06-27 Thread Devaraj K (JIRA)

Devaraj K created MAPREDUCE-4379:


 Summary: Node Manager throws java.lang.OutOfMemoryError: Java heap 
space due to org.apache.hadoop.fs.LocalDirAllocator.contexts
 Key: MAPREDUCE-4379
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4379
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, nodemanager
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Devaraj K
Assignee: Devaraj K
Priority: Critical


{code:xml}
Exception in thread "Container Monitor" java.lang.OutOfMemoryError: Java heap 
space
at java.io.BufferedReader.<init>(BufferedReader.java:80)
at java.io.BufferedReader.<init>(BufferedReader.java:91)
at 
org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:410)
at 
org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:171)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl$MonitoringThread.run(ContainersMonitorImpl.java:389)
Exception in thread "LocalizerRunner for 
container_1340690914008_10890_01_03" java.lang.OutOfMemoryError: Java heap 
space
at java.util.Arrays.copyOfRange(Arrays.java:3209)
at java.lang.String.<init>(String.java:215)
at 
com.sun.org.apache.xerces.internal.xni.XMLString.toString(XMLString.java:185)
at 
com.sun.org.apache.xerces.internal.parsers.AbstractDOMParser.characters(AbstractDOMParser.java:1188)
at 
com.sun.org.apache.xerces.internal.xinclude.XIncludeHandler.characters(XIncludeHandler.java:1084)
at 
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:464)
at 
com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
at 
com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
at 
com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
at 
com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:235)
at 
com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284)
at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:180)
at 
org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1738)
at 
org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1689)
at 
org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1635)
at org.apache.hadoop.conf.Configuration.set(Configuration.java:722)
at 
org.apache.hadoop.conf.Configuration.setStrings(Configuration.java:1300)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.initDirs(ContainerLocalizer.java:375)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:127)
at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:103)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:862)
{code}


[jira] [Commented] (MAPREDUCE-4377) TaskRunner javaopts parsing doesn't handle embedded spaces

2012-06-27 Thread Robert Joseph Evans (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402230#comment-13402230
 ] 

Robert Joseph Evans commented on MAPREDUCE-4377:


John,

That is very true, and if you can fix it I would be very happy to commit it for 
you.  However, I don't think this is the only place in the code that has 
problems with embedded spaces.  I'm not saying that we should not fix it, we 
should, just be aware that there be monsters here.  Also be aware that there 
may be some Windows vs. POSIX(bash) issues that you may run into with trying to 
parse the arguments.  Hopefully not too much though.

> TaskRunner javaopts parsing doesn't handle embedded spaces
> --
>
> Key: MAPREDUCE-4377
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4377
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task-controller
>Affects Versions: trunk
> Environment: java options containing escaped or non-escaped embedded 
> spaces.
>Reporter: John Gordon
>
> TaskRunner::GetVMArgs reads getChildJavaOpts as one space-delimited string, 
> then splits it on ' ' and tries to reason about individual options from 
> there.  The problem with this approach is that java options may contain 
> embedded spaces in many legitimate cases -- this means it is reasoning on 
> incomplete option strings and cannot do appropriate preprocessing to do 
> things like handle escape characters or matched quotation marks.
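
One way to avoid splitting inside an option is a small tokenizer that honours 
quotes and backslash escapes. The sketch below is an assumption about what 
such preprocessing could look like, not the TaskRunner code or a proposed 
patch: the class JavaOptsTokenizer and the method tokenize are hypothetical, 
and the quoting rules are deliberately simpler than those of bash or the 
Windows command line.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of TaskRunner.
final class JavaOptsTokenizer {

  // Splits on spaces and tabs, except inside single or double quotes or
  // after a backslash. Quote characters themselves are dropped.
  static List<String> tokenize(String opts) {
    List<String> out = new ArrayList<String>();
    StringBuilder cur = new StringBuilder();
    boolean inToken = false;
    char quote = 0;  // 0 when not inside a quoted section
    for (int i = 0; i < opts.length(); i++) {
      char c = opts.charAt(i);
      if (quote != 0) {
        if (c == quote) {
          quote = 0;            // closing quote
        } else {
          cur.append(c);        // everything inside quotes is literal
        }
      } else if (c == '"' || c == '\'') {
        quote = c;              // opening quote
        inToken = true;
      } else if (c == '\\' && i + 1 < opts.length()) {
        cur.append(opts.charAt(++i));  // escaped character outside quotes
        inToken = true;
      } else if (c == ' ' || c == '\t') {
        if (inToken) {
          out.add(cur.toString());
          cur.setLength(0);
          inToken = false;
        }
      } else {
        cur.append(c);
        inToken = true;
      }
    }
    if (inToken) {
      out.add(cur.toString());  // an unterminated quote runs to the end
    }
    return out;
  }
}
```

For the input -Xmx200m "-Dfoo=a b" -Dbar=c\ d this yields three options, with 
the embedded spaces kept inside the second and third.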


[jira] [Updated] (MAPREDUCE-4049) plugin for generic shuffle service

2012-06-27 Thread Avner BenHanoch (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Avner BenHanoch updated MAPREDUCE-4049:
---

Attachment: HADOOP-1.x.y-review-oriented.patch

This patch replaces all my previous patches.  It is written to ease code 
review by making only the minimal changes to existing code.   *I believe 
anyone can verify this patch at a glance!*

(my old patches included design enhancements by moving plugins' shared code out 
of ReduceCopier into plugins' base class, and by making ReduceCopier a 
standalone class instead of being inner class of ReduceTask).


> plugin for generic shuffle service
> --
>
> Key: MAPREDUCE-4049
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: performance, task, tasktracker
>Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0
>Reporter: Avner BenHanoch
>  Labels: merge, plugin, rdma, shuffle
> Attachments: HADOOP-1.0.2.patch, HADOOP-1.0.x.patch, 
> HADOOP-1.1.patch, HADOOP-1.x.y-review-oriented.patch, Hadoop Shuffle Consumer 
> Plugin TLD.rtf, Hadoop Shuffle Provider Plugin TLD.rtf, mapred-site.xml
>
>
> Support generic shuffle service as set of two plugins: ShuffleProvider & 
> ShuffleConsumer.
> This will satisfy the following needs:
> # Better shuffle and merge performance. For example: we are working on 
> shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, 
> or Infiniband) instead of using the current HTTP shuffle. Based on the fast 
> RDMA shuffle, the plugin can also utilize a suitable merge approach during 
> the intermediate merges. Hence, getting much better performance.
> # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden 
> dependency of NodeManager with a specific version of mapreduce shuffle 
> (currently targeted to 0.24.0).
> References:
> # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu 
> from Auburn University with others, 
> [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
> # I am attaching 2 documents with suggested Top Level Design for both plugins 
> (currently, based on 1.0 branch)


[jira] [Commented] (MAPREDUCE-4228) mapreduce.job.reduce.slowstart.completedmaps is not working properly to delay the scheduling of the reduce tasks

2012-06-27 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402198#comment-13402198
 ] 

Hudson commented on MAPREDUCE-4228:
---

Integrated in Hadoop-Hdfs-0.23-Build #299 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/299/])
svn merge -c 1354181 FIXES: MAPREDUCE-4228. 
mapreduce.job.reduce.slowstart.completedmaps is not working properly (Jason 
Lowe via bobby) (Revision 1354185)

 Result = UNSTABLE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1354185
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRMContainerAllocator.java


> mapreduce.job.reduce.slowstart.completedmaps is not working properly to delay 
> the scheduling of the reduce tasks
> 
>
> Key: MAPREDUCE-4228
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4228
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, mrv2
>Affects Versions: 0.23.1
>Reporter: Jason Lowe
>Assignee: Jason Lowe
> Fix For: 0.23.3, 2.0.1-alpha, 3.0.0
>
> Attachments: MAPREDUCE-4228.patch, MAPREDUCE-4228.patch, 
> MAPREDUCE-4228.patch
>
>
> If no more map tasks need to be scheduled but not all have completed, the 
> ApplicationMaster will start scheduling reducers even if the number of 
> completed maps has not met the mapreduce.job.reduce.slowstart.completedmaps 
> threshold.  For example, if the property is set to 1.0 all maps should 
> complete before any reducers are scheduled.  However the reducers are 
> scheduled as soon as the last map task is assigned to a container.  For a job 
> with very long-running maps, a cluster with enough capacity to launch all map 
> tasks could cause reducers to launch prematurely and waste cluster resources.
> Thanks to Phil Su for discovering this issue.
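
To make the intended behaviour concrete, here is a minimal sketch of the check the description implies: reducers become eligible only once the number of completed maps reaches the slowstart fraction of all maps, regardless of how many maps have merely been assigned to containers. The class and method names are hypothetical and this is not the code in RMContainerAllocator; only the property name comes from the issue.

```java
/**
 * Hypothetical illustration of the slowstart threshold described above.
 * This is NOT the actual RMContainerAllocator logic from the patch.
 */
final class SlowstartSketch {
    private SlowstartSketch() {}

    /**
     * @param totalMaps     total number of map tasks in the job
     * @param completedMaps number of map tasks that have finished
     * @param slowstart     value of mapreduce.job.reduce.slowstart.completedmaps,
     *                      assumed here to be a fraction between 0.0 and 1.0
     * @return true if reducers may be scheduled
     */
    static boolean canScheduleReducers(int totalMaps, int completedMaps, double slowstart) {
        if (totalMaps == 0) {
            return true; // a job without maps has nothing to wait for
        }
        // Number of maps that must have completed (not merely been assigned).
        int required = (int) Math.ceil(slowstart * totalMaps);
        return completedMaps >= required;
    }
}
```

With slowstart set to 1.0 and 10 maps, of which 7 have completed and the remaining 3 are still running, this check returns false, which is the behaviour the report expects.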

[jira] [Created] (MAPREDUCE-4378) hadoop-validate-setup.sh fails to execute kinit command in secure mode

2012-06-27 Thread Nishan Shetty (JIRA)

Nishan Shetty created MAPREDUCE-4378:


 Summary: hadoop-validate-setup.sh fails to execute kinit command 
in secure mode
 Key: MAPREDUCE-4378
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4378
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.1-alpha, 3.0.0
 Environment: SUSE Linux Enterprise Server 11 (x86_64)
VERSION = 11
PATCHLEVEL = 1
Reporter: Nishan Shetty


hadoop-validate-setup.sh refers to an invalid kinit location.
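
The report does not say which location the script uses or how it should be fixed. As one possible illustration only, the sketch below resolves a command such as kinit from a PATH-style string rather than assuming a fixed location. The class and method names are hypothetical, and the code is written in Java for consistency with the rest of the project rather than as a proposed change to the shell script.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

/** Hypothetical helper; not part of Hadoop or of hadoop-validate-setup.sh. */
final class KinitLocator {
    private KinitLocator() {}

    /**
     * Searches each directory of a PATH-style string for an executable file
     * with the given name and returns the first match, if any.
     */
    static Optional<Path> findOnPath(String command, String pathValue) {
        if (pathValue == null || pathValue.isEmpty()) {
            return Optional.empty();
        }
        for (String dir : pathValue.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue; // skip empty PATH entries
            }
            Path candidate = Paths.get(dir, command);
            if (Files.isRegularFile(candidate) && Files.isExecutable(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

A caller would typically pass the environment's search path, for example KinitLocator.findOnPath("kinit", System.getenv("PATH")), and report a clear error when the result is empty.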

[jira] [Commented] (MAPREDUCE-4228) mapreduce.job.reduce.slowstart.completedmaps is not working properly to delay the scheduling of the reduce tasks

2012-06-27 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402155#comment-13402155
 ] 

Hudson commented on MAPREDUCE-4228:
---

Integrated in Hadoop-Hdfs-trunk #1089 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1089/])
MAPREDUCE-4228. mapreduce.job.reduce.slowstart.completedmaps is not working 
properly (Jason Lowe via bobby) (Revision 1354181)

 Result = FAILURE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1354181
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRMContainerAllocator.java


> mapreduce.job.reduce.slowstart.completedmaps is not working properly to delay 
> the scheduling of the reduce tasks
> 
>
> Key: MAPREDUCE-4228
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4228
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, mrv2
>Affects Versions: 0.23.1
>Reporter: Jason Lowe
>Assignee: Jason Lowe
> Fix For: 0.23.3, 2.0.1-alpha, 3.0.0
>
> Attachments: MAPREDUCE-4228.patch, MAPREDUCE-4228.patch, 
> MAPREDUCE-4228.patch
>
>
> If no more map tasks need to be scheduled but not all have completed, the 
> ApplicationMaster will start scheduling reducers even if the number of 
> completed maps has not met the mapreduce.job.reduce.slowstart.completedmaps 
> threshold.  For example, if the property is set to 1.0 all maps should 
> complete before any reducers are scheduled.  However the reducers are 
> scheduled as soon as the last map task is assigned to a container.  For a job 
> with very long-running maps, a cluster with enough capacity to launch all map 
> tasks could cause reducers to launch prematurely and waste cluster resources.
> Thanks to Phil Su for discovering this issue.

[jira] [Commented] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager

2012-06-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402126#comment-13402126
 ] 

Hadoop QA commented on MAPREDUCE-4372:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12533636/MAPREDUCE-4372-1.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 javadoc.  The javadoc tool appears to have generated 2 warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2520//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2520//console

This message is automatically generated.

> Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor 
> and Shutdown hook manager
> --
>
> Key: MAPREDUCE-4372
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4372
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, resourcemanager
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Devaraj K
>Assignee: Devaraj K
> Attachments: MAPREDUCE-4372-1.patch, MAPREDUCE-4372.patch, 
> rm-threaddump.out
>
>
> Please find the attached resource manager thread dump for the issue.

[jira] [Updated] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager

2012-06-27 Thread Devaraj K (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-4372:
-

Status: Patch Available  (was: Open)

Thanks a lot, Robert, for looking into the patch. I have updated the patch as 
per your suggestion.

> Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor 
> and Shutdown hook manager
> --
>
> Key: MAPREDUCE-4372
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4372
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, resourcemanager
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Devaraj K
>Assignee: Devaraj K
> Attachments: MAPREDUCE-4372-1.patch, MAPREDUCE-4372.patch, 
> rm-threaddump.out
>
>
> Please find the attached resource manager thread dump for the issue.

[jira] [Updated] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager

2012-06-27 Thread Devaraj K (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-4372:
-

Status: Open  (was: Patch Available)

> Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor 
> and Shutdown hook manager
> --
>
> Key: MAPREDUCE-4372
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4372
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, resourcemanager
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Devaraj K
>Assignee: Devaraj K
> Attachments: MAPREDUCE-4372-1.patch, MAPREDUCE-4372.patch, 
> rm-threaddump.out
>
>
> Please find the attached resource manager thread dump for the issue.

[jira] [Updated] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager

2012-06-27 Thread Devaraj K (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-4372:
-

Attachment: MAPREDUCE-4372-1.patch

> Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor 
> and Shutdown hook manager
> --
>
> Key: MAPREDUCE-4372
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4372
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, resourcemanager
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Devaraj K
>Assignee: Devaraj K
> Attachments: MAPREDUCE-4372-1.patch, MAPREDUCE-4372.patch, 
> rm-threaddump.out
>
>
> Please find the attached resource manager thread dump for the issue.

[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

2012-06-27 Thread Ahmed Radwan (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402047#comment-13402047
 ] 

Ahmed Radwan commented on MAPREDUCE-4346:
-

As I highlighted in the ticket description above, the JobClient only exposes a 
getAllJobs() that returns all submitted jobs in any state, including all 
retired jobs. This list is long and represents unneeded overhead, especially 
for clients only interested in jobs in specific states. 

One use case is a monitoring service that uses the JobClient and periodically 
calls getAllJobs() to keep track of submitted jobs. Using the current 
getAllJobs() incurs communication overhead because the returned list is 
unnecessarily long and, when requested periodically, largely redundant.

The new API lets clients selectively filter the long list that getAllJobs() 
normally returns. The client can now specify as part of the call the job 
statuses of interest and whether retired jobs should be included.

What do you think, Arun?
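
For illustration, the filtering semantics described in this comment (a set of job statuses of interest plus a flag for retired jobs) could be sketched as below. The State enum, JobInfo class, and filterJobs method are hypothetical names and do not reflect the signatures in the attached patches. The sketch filters an in-memory list, whereas the proposal applies the filter inside the call so that the long list is never sent to the client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of status-based job filtering; not the patch's API. */
final class JobFilterSketch {
    private JobFilterSketch() {}

    /** Illustrative job states; the real JobStatus values may differ. */
    enum State { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

    /** Minimal stand-in for a job status record. */
    static final class JobInfo {
        final String id;
        final State state;
        final boolean retired;

        JobInfo(String id, State state, boolean retired) {
            this.id = id;
            this.state = state;
            this.retired = retired;
        }
    }

    /**
     * Returns only the jobs whose state is in {@code wanted}; retired jobs
     * are dropped unless {@code includeRetired} is true.
     */
    static List<JobInfo> filterJobs(List<JobInfo> all, Set<State> wanted, boolean includeRetired) {
        List<JobInfo> result = new ArrayList<>();
        for (JobInfo job : all) {
            if (job.retired && !includeRetired) {
                continue; // caller asked to leave retired jobs out
            }
            if (wanted.contains(job.state)) {
                result.add(job);
            }
        }
        return result;
    }
}
```

A monitoring service that only tracks active jobs would, under these assumptions, ask for the RUNNING and PREP states with includeRetired set to false.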

> Adding a refined version of JobTracker.getAllJobs() and exposing through the 
> JobClient
> --
>
> Key: MAPREDUCE-4346
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv1
>Reporter: Ahmed Radwan
>Assignee: Ahmed Radwan
> Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, 
> MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch
>
>
> The current implementation for JobTracker.getAllJobs() returns all submitted 
> jobs in any state, in addition to retired jobs. This list can be long and 
> represents an unneeded overhead especially in the case of clients only 
> interested in jobs in specific state(s). 
> It is beneficial to include a refined version where only jobs having specific 
> statuses are returned and retired jobs are optional to include. 
> I'll be uploading an initial patch momentarily.

[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

2012-06-27 Thread Arun C Murthy (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402013#comment-13402013
 ] 

Arun C Murthy commented on MAPREDUCE-4346:
--

Asking again, what is the use case? I really don't like the API... particularly 
since it's a public API.

> Adding a refined version of JobTracker.getAllJobs() and exposing through the 
> JobClient
> --
>
> Key: MAPREDUCE-4346
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv1
>Reporter: Ahmed Radwan
>Assignee: Ahmed Radwan
> Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, 
> MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch
>
>
> The current implementation for JobTracker.getAllJobs() returns all submitted 
> jobs in any state, in addition to retired jobs. This list can be long and 
> represents an unneeded overhead especially in the case of clients only 
> interested in jobs in specific state(s). 
> It is beneficial to include a refined version where only jobs having specific 
> statuses are returned and retired jobs are optional to include. 
> I'll be uploading an initial patch momentarily.

[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

2012-06-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402006#comment-13402006
 ] 

Hadoop QA commented on MAPREDUCE-4346:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12533607/MAPREDUCE-4346_rev4.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2519//console

This message is automatically generated.

> Adding a refined version of JobTracker.getAllJobs() and exposing through the 
> JobClient
> --
>
> Key: MAPREDUCE-4346
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv1
>Reporter: Ahmed Radwan
>Assignee: Ahmed Radwan
> Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, 
> MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch
>
>
> The current implementation for JobTracker.getAllJobs() returns all submitted 
> jobs in any state, in addition to retired jobs. This list can be long and 
> represents an unneeded overhead especially in the case of clients only 
> interested in jobs in specific state(s). 
> It is beneficial to include a refined version where only jobs having specific 
> statuses are returned and retired jobs are optional to include. 
> I'll be uploading an initial patch momentarily.

