[jira] [Commented] (MAPREDUCE-4288) ClusterStatus.getMapTasks() and ClusterStatus.getReduceTasks() is giving one when no job is running

2012-06-07 Thread Devaraj K (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291555#comment-13291555
 ] 

Devaraj K commented on MAPREDUCE-4288:
--

This is caused by the hard-coded values in the code below:

{code:title=ResourceMgrDelegate.java|borderStyle=solid}
  public ClusterMetrics getClusterMetrics() throws IOException,
      InterruptedException {
    GetClusterMetricsRequest request =
        recordFactory.newRecordInstance(GetClusterMetricsRequest.class);
    GetClusterMetricsResponse response =
        applicationsManager.getClusterMetrics(request);
    YarnClusterMetrics metrics = response.getClusterMetrics();
    ClusterMetrics oldMetrics = new ClusterMetrics(1, 1, 1, 1, 1, 1,
        metrics.getNumNodeManagers() * 10, metrics.getNumNodeManagers() * 2, 1,
        metrics.getNumNodeManagers(), 0, 0);
    return oldMetrics;
  }
{code} 

Here we cannot get runningMaps, runningReduces, occupiedMapSlots, etc. from the 
RM, because a YARN cluster is managed entirely in terms of resources and 
resource usage.


Always showing these hard-coded values does not look good when users try to get 
the cluster status through the JobClient.getClusterStatus() API.

Any thoughts on this?
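
To make the effect concrete, here is a minimal, self-contained sketch. FakeClusterMetrics is a hypothetical stand-in, not the real ClusterMetrics class; it models only the first two constructor arguments, on the assumption that they carry the running map and reduce counts.

```java
// Hypothetical stand-in for ClusterMetrics; only two fields are modelled.
final class FakeClusterMetrics {
  private final int runningMaps;
  private final int runningReduces;

  FakeClusterMetrics(int runningMaps, int runningReduces) {
    this.runningMaps = runningMaps;
    this.runningReduces = runningReduces;
  }

  int getRunningMaps() { return runningMaps; }
  int getRunningReduces() { return runningReduces; }
}

final class HardCodedMetricsDemo {
  // Same pattern as the snippet above: constants instead of real usage data.
  static FakeClusterMetrics hardCoded() {
    return new FakeClusterMetrics(1, 1);
  }

  public static void main(String[] args) {
    FakeClusterMetrics m = hardCoded();
    // The values do not depend on whether any job is actually running.
    System.out.println(m.getRunningMaps() + " " + m.getRunningReduces());
  }
}
```

Whatever the cluster is doing, a caller of such an object sees the constants, which matches what the reporter observed on an idle cluster.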

> ClusterStatus.getMapTasks() and ClusterStatus.getReduceTasks() is giving one 
> when no job is running
> ---
>
> Key: MAPREDUCE-4288
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4288
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.0-alpha
>Reporter: Nishan Shetty
>
> When no job is running in the cluster invoke the ClusterStatus.getMapTasks() 
> and ClusterStatus.getReduceTasks() API's
> Observed that these API's are returning one instead of zero(as no job is 
> running)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-4327) Enhance CS to schedule accounting for both memory and cpu cores

2012-06-07 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291553#comment-13291553
 ] 

Arun C Murthy commented on MAPREDUCE-4327:
--

An option to consider for multi-resource scheduling is the approach outlined by 
Ghodsi et al in the DRF paper: 
http://www.eecs.berkeley.edu/Pubs/TechRpts/2011/EECS-2011-18.pdf
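
As a rough illustration of the idea in that paper, the sketch below computes each user's dominant share (the largest fraction of any single resource that the user holds) and picks the user with the smallest one to serve next. The class and method names are hypothetical and this is not CapacityScheduler code; the numbers reuse the paper's example of a cluster with 9 CPUs and 18 GB of memory. The check that the next task actually fits in the remaining capacity is left out.

```java
final class DrfSketch {
  // Dominant share: the largest fraction of any one resource held by a user.
  static double dominantShare(double[] usage, double[] capacity) {
    double share = 0.0;
    for (int i = 0; i < usage.length; i++) {
      share = Math.max(share, usage[i] / capacity[i]);
    }
    return share;
  }

  // Returns the index of the user with the smallest dominant share.
  static int nextUser(double[][] usages, double[] capacity) {
    int best = 0;
    for (int u = 1; u < usages.length; u++) {
      if (dominantShare(usages[u], capacity)
          < dominantShare(usages[best], capacity)) {
        best = u;
      }
    }
    return best;
  }

  public static void main(String[] args) {
    double[] capacity = {9.0, 18.0};               // {CPUs, GB of memory}
    double[][] usages = {{1.0, 4.0}, {3.0, 1.0}};  // users A and B, one task each
    System.out.println(nextUser(usages, capacity));
  }
}
```

With these inputs user A's dominant resource is memory (4/18) and user B's is CPU (3/9), so A has the smaller dominant share and would be offered resources next.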


> Enhance CS to schedule accounting for both memory and cpu cores
> ---
>
> Key: MAPREDUCE-4327
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4327
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv2, resourcemanager, scheduler
>Affects Versions: 2.0.0-alpha
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
>
> With YARN being a general purpose system, it would be useful for several 
> applications (MPI et al) to specify not just memory but also CPU (cores) for 
> their resource requirements. Thus, it would be useful to the 
> CapacityScheduler to account for both.





[jira] [Created] (MAPREDUCE-4327) Enhance CS to schedule accounting for both memory and cpu cores

2012-06-07 Thread Arun C Murthy (JIRA)
Arun C Murthy created MAPREDUCE-4327:


 Summary: Enhance CS to schedule accounting for both memory and cpu 
cores
 Key: MAPREDUCE-4327
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4327
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: mrv2, resourcemanager, scheduler
Affects Versions: 2.0.0-alpha
Reporter: Arun C Murthy
Assignee: Arun C Murthy


With YARN being a general purpose system, it would be useful for several 
applications (MPI et al) to specify not just memory but also CPU (cores) for 
their resource requirements. Thus, it would be useful to the CapacityScheduler 
to account for both.





[jira] [Commented] (MAPREDUCE-4318) TestRecoveryManager should not use raw and deprecated configuration parameters.

2012-06-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291550#comment-13291550
 ] 

Hudson commented on MAPREDUCE-4318:
---

Integrated in Hadoop-Mapreduce-22-branch #105 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-22-branch/105/])
MAPREDUCE-4318. TestRecoveryManager should not use raw configuration keys. 
Contributed by Benoy Antony. (Revision 1347853)

 Result = FAILURE
shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1347853
Files : 
* /hadoop/common/branches/branch-0.22/mapreduce/CHANGES.txt
* 
/hadoop/common/branches/branch-0.22/mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestRecoveryManager.java


> TestRecoveryManager should not use raw and deprecated configuration 
> parameters.
> ---
>
> Key: MAPREDUCE-4318
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4318
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.22.1
>Reporter: Konstantin Shvachko
>Assignee: Benoy Antony
> Fix For: 0.22.1
>
> Attachments: MAPREDUCE-4318.patch
>
>
> TestRecoveryManager should not use deprecated config keys, and should use 
> constants for the keys where possible.





[jira] [Commented] (MAPREDUCE-4289) JobStatus.getReduceProgress() and JobStatus.getMapProgress() API's not giving any values

2012-06-07 Thread Devaraj K (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291548#comment-13291548
 ] 

Devaraj K commented on MAPREDUCE-4289:
--

This is caused by the hard-coded values in the code below:

{code:title=TypeConverter.java|borderStyle=solid}
  public static JobStatus fromYarn(ApplicationReport application,
      String jobFile) {
    String trackingUrl = application.getTrackingUrl();
    trackingUrl = trackingUrl == null ? "" : trackingUrl;
    JobStatus jobStatus =
      new JobStatus(
          TypeConverter.fromYarn(application.getApplicationId()),
          0.0f, 0.0f, 0.0f, 0.0f,
          TypeConverter.fromYarn(application.getYarnApplicationState(),
              application.getFinalApplicationStatus()),
          org.apache.hadoop.mapreduce.JobPriority.NORMAL,
          application.getUser(), application.getName(),
          application.getQueue(), jobFile, trackingUrl, false
      );
    jobStatus.setSchedulingInfo(trackingUrl); // Set AM tracking url

{code} 

Here we have no way to get the map and reduce progress from the RM. 

Always showing these hard-coded values does not look good when users call the 
JobClient.getAllJobs() API.


Any thoughts?

> JobStatus.getReduceProgress() and JobStatus.getMapProgress() API's not giving 
> any values
> 
>
> Key: MAPREDUCE-4289
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4289
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.0-alpha
>Reporter: Nishan Shetty
>
> 1.Run a simple job
> 2.Invoke JobStatus.getReduceProgress() and JobStatus.getMapProgress() API's
> Observe that these API's are giving zeros instead of showing map/reduce 
> progress





[jira] [Assigned] (MAPREDUCE-3902) MR AM should reuse containers for map tasks, there-by allowing fine-grained control on num-maps for users without need for CombineFileInputFormat etc.

2012-06-07 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy reassigned MAPREDUCE-3902:


Assignee: Siddharth Seth  (was: Arun C Murthy)

> MR AM should reuse containers for map tasks, there-by allowing fine-grained 
> control on num-maps for users without need for CombineFileInputFormat etc.
> --
>
> Key: MAPREDUCE-3902
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3902
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster, mrv2
>Reporter: Arun C Murthy
>Assignee: Siddharth Seth
> Attachments: MAPREDUCE-3902.patch
>
>
> The MR AM is now in a great position to reuse containers across (map) tasks. 
> This is something similar to JVM re-use we had in 0.20.x, but in a 
> significantly better manner:
> # Consider data-locality when re-using containers
> # Consider the new shuffle - ensure that reduces fetch output of the whole 
> container at once (i.e. all maps) 





[jira] [Created] (MAPREDUCE-4326) Resurrect RM Restart

2012-06-07 Thread Arun C Murthy (JIRA)
Arun C Murthy created MAPREDUCE-4326:


 Summary: Resurrect RM Restart 
 Key: MAPREDUCE-4326
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4326
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, resourcemanager
Affects Versions: 2.0.0-alpha
Reporter: Arun C Murthy
Assignee: Bikas Saha


We should resurrect 'RM Restart' which we disabled sometime during the RM 
refactor.





[jira] [Commented] (MAPREDUCE-3842) Add a toggle button to all web pages to stop automatic refreshs

2012-06-07 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291537#comment-13291537
 ] 

Alejandro Abdelnur commented on MAPREDUCE-3842:
---

+1. Built and started a cluster: no auto refreshing, everything else is the same.

> Add a toggle button to all web pages to stop automatic refreshs
> ---
>
> Key: MAPREDUCE-3842
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3842
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2, webapps
>Affects Versions: 0.23.1
>Reporter: Alejandro Abdelnur
>Assignee: Thomas Graves
>Priority: Critical
> Attachments: MAPREDUCE-3842.patch
>
>
> The automatic refresh makes it quite hard to look at something specific, as it 
> makes the page jump and sometimes resets its position. 
> This is especially painful when looking at jobs with a large number of tasks.





[jira] [Commented] (MAPREDUCE-4298) NodeManager crashed after running out of file descriptors

2012-06-07 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291492#comment-13291492
 ] 

Jason Lowe commented on MAPREDUCE-4298:
---

This occurred again on one of our clusters.  Turns out I was mistaken earlier, 
the file descriptor ulimit for our nodemanager daemons is set to 32768, not 
8192.  Fortunately this time we were able to examine some nodemanagers that had 
leaked numerous file descriptors but had not fallen over yet.

Almost all of the file descriptors were referencing map outputs for the 
shuffle, often hundreds of file descriptors open to the same file.  
Interestingly almost all of the map files corresponded to just one job.  
Examining the NM log around the time that job ran, I found numerous exceptions 
in it showing things had not gone smoothly during the shuffle for that job.  
For example:

{noformat}
 [New I/O server worker #1-5]java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:100)
    at sun.nio.ch.IOUtil.write(IOUtil.java:56)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
    at org.jboss.netty.channel.socket.nio.SocketSendBufferPool$PooledSendBuffer.transferTo(SocketSendBufferPool.java:239)
    at org.jboss.netty.channel.socket.nio.NioWorker.write0(NioWorker.java:470)
    at org.jboss.netty.channel.socket.nio.NioWorker.writeFromUserCode(NioWorker.java:388)
    at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.handleAcceptedSocket(NioServerSocketPipelineSink.java:137)
    at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.eventSunk(NioServerSocketPipelineSink.java:76)
    at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:68)
    at org.jboss.netty.handler.stream.ChunkedWriteHandler.flush(ChunkedWriteHandler.java:253)
    at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleDownstream(ChunkedWriteHandler.java:123)
    at org.jboss.netty.channel.Channels.write(Channels.java:611)
    at org.jboss.netty.channel.Channels.write(Channels.java:578)
    at org.jboss.netty.channel.AbstractChannel.write(AbstractChannel.java:259)
    at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.sendMapOutput(ShuffleHandler.java:477)
    at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.messageReceived(ShuffleHandler.java:397)
    at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:144)
    at org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:116)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:302)
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.unfoldAndfireMessageReceived(ReplayingDecoder.java:523)
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:507)
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:444)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:274)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:261)
    at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:350)
    at org.jboss.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:281)
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:201)
    at org.jboss.netty.util.internal.IoWorkerRunnable.run(IoWorkerRunnable.java:46)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
{noformat}

Looking closer at the job, I could see that it had run with 15000 maps and 2000 
reduces.  Hundreds of the reducers had failed after running out of heap space 
during the shuffle phase, which led to broken pipe and connection reset errors 
on the nodemanagers trying to serve up shuffle data to those reducers when they 
died.

I was able to reproduce the broken pipe issue and step through the code with a 
debugger.  Normally the file descriptor is closed by adding a ChannelFuture 
after the map data is written, and that future's operationComplete() callback 
closes the file.  However when there is an I/O error sending the shuffle 
header, Netty closes down the channel automatically (plus we explicitly close 
it in a channel exception handler).  By the time we try to write the map file 
data to the channel, the channel is already closed.  And I was able to see that 
if we write to a closed channel, the ChannelFuture's operationComp

[jira] [Resolved] (MAPREDUCE-4318) TestRecoveryManager should not use raw and deprecated configuration parameters.

2012-06-07 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko resolved MAPREDUCE-4318.


   Resolution: Fixed
Fix Version/s: 0.22.1
 Hadoop Flags: Reviewed

I just committed this. Thank you Benoy.

> TestRecoveryManager should not use raw and deprecated configuration 
> parameters.
> ---
>
> Key: MAPREDUCE-4318
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4318
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.22.1
>Reporter: Konstantin Shvachko
>Assignee: Benoy Antony
> Fix For: 0.22.1
>
> Attachments: MAPREDUCE-4318.patch
>
>
> TestRecoveryManager should not use deprecated config keys, and should use 
> constants for the keys where possible.





[jira] [Updated] (MAPREDUCE-4317) Job view ACL checks are too permissive

2012-06-07 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4317:


Attachment: MR-4317.patch.v1

Added v1 patch for this - the changes are fairly small and straightforward.

1. I didn't see any tests checking TaskGraphServlet.
2. Do we need to add a test to verify this behavior? If so, can someone please 
point me to similar existing tests?

> Job view ACL checks are too permissive
> --
>
> Key: MAPREDUCE-4317
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4317
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1
>Affects Versions: 1.0.3
>Reporter: Harsh J
>Assignee: Karthik Kambatla
> Attachments: MR-4317.patch.v1
>
>
> The class that does view-based checks, JSPUtil.JobWithViewAccessCheck, has 
> the following internal member:
> {code}private boolean isViewAllowed = true;{code}
> Note that it's true.
> Now, the method that sets proper view-allowed rights has:
> {code}
> if (user != null && job != null && jt.areACLsEnabled()) {
>   final UserGroupInformation ugi =
>       UserGroupInformation.createRemoteUser(user);
>   try {
>     ugi.doAs(new PrivilegedExceptionAction() {
>       public Void run() throws IOException, ServletException {
>         // checks job view permission
>         jt.getACLsManager().checkAccess(job, ugi,
>             Operation.VIEW_JOB_DETAILS);
>         return null;
>       }
>     });
>   } catch (AccessControlException e) {
>     String errMsg = "User " + ugi.getShortUserName() +
>         " failed to view " + jobid + "!" + e.getMessage() +
>         "Go back to JobTracker";
>     JSPUtil.setErrorAndForward(errMsg, request, response);
>     myJob.setViewAccess(false);
>   } catch (InterruptedException e) {
>     String errMsg = " Interrupted while trying to access " + jobid +
>         "Go back to JobTracker";
>     JSPUtil.setErrorAndForward(errMsg, request, response);
>     myJob.setViewAccess(false);
>   }
> }
> return myJob;
> {code}
> In the above snippet, you can see that when user == null, which can happen 
> if the user is not http-authenticated (as it is obtained via 
> request.getRemoteUser()), the view can remain visible, since the default is 
> true and we don't toggle the view to false for the user == null case.
> Ideally the default of the view job ACL should be false, or we need an else 
> clause that sets the view rights to false in case of a failure to find the 
> user ID.
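
A minimal sketch of the "deny unless the check succeeds" idea might look like the following. ViewAccessSketch and its AclCheck interface are hypothetical stand-ins, not the JSPUtil or ACLsManager APIs, and keeping the view open when ACLs are disabled is an assumption carried over from the current behaviour.

```java
final class ViewAccessSketch {
  // Hypothetical stand-in for the real ACL check; throws when access is denied.
  interface AclCheck {
    void check(String user) throws SecurityException;
  }

  // Fail closed: with ACLs enabled, the view is allowed only after a successful check.
  static boolean isViewAllowed(String user, boolean aclsEnabled, AclCheck acl) {
    if (!aclsEnabled) {
      return true;   // ACLs are off, so there is nothing to enforce
    }
    if (user == null) {
      return false;  // unauthenticated request: deny instead of defaulting to true
    }
    try {
      acl.check(user);
      return true;
    } catch (SecurityException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    AclCheck onlyAlice = user -> {
      if (!"alice".equals(user)) {
        throw new SecurityException("denied: " + user);
      }
    };
    System.out.println(isViewAllowed(null, true, onlyAlice));
    System.out.println(isViewAllowed("alice", true, onlyAlice));
  }
}
```

The point of the design is that every path which does not complete the check ends in a denial, so a missing user ID can no longer fall through to the permissive default.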





[jira] [Updated] (MAPREDUCE-4318) TestRecoveryManager should not use raw and deprecated configuration parameters.

2012-06-07 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated MAPREDUCE-4318:
---

Summary: TestRecoveryManager should not use raw and deprecated 
configuration parameters.  (was: TestRecoveryManagershould not use raw and 
deprecated configuration parameters.)

+1 Looks good to me.

> TestRecoveryManager should not use raw and deprecated configuration 
> parameters.
> ---
>
> Key: MAPREDUCE-4318
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4318
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.22.1
>Reporter: Konstantin Shvachko
>Assignee: Benoy Antony
> Attachments: MAPREDUCE-4318.patch
>
>
> TestRecoveryManager should not use deprecated config keys, and should use 
> constants for the keys where possible.





[jira] [Commented] (MAPREDUCE-4321) DefaultTaskController fails to launch tasks on Windows

2012-06-07 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291451#comment-13291451
 ] 

Ivan Mitic commented on MAPREDUCE-4321:
---

Thanks Daryn, I opened HADOOP-8493 to track this.

> DefaultTaskController fails to launch tasks on Windows
> --
>
> Key: MAPREDUCE-4321
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4321
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Ivan Mitic
>Assignee: Ivan Mitic
> Attachments: MAPREDUCE-4321-branch-1-win.patch
>
>
> DefaultTaskController#launchTask tries to run the child JVM task with the 
> following command line:
> {code}cmd.exe /c /c:/some/path.../taskjvm.cmd{code}
> And this fails because the given path is prefixed with a forward slash. This 
> also causes a number of tests to fail:
> org.apache.hadoop.conf.TestNoDefaultsJobConf
> org.apache.hadoop.fs.TestCopyFiles
> org.apache.hadoop.mapred.TestBadRecords
> org.apache.hadoop.mapred.TestClusterMRNotification
> org.apache.hadoop.mapred.TestCompressedEmptyMapOutputs
> org.apache.hadoop.mapred.TestControlledMapReduceJob
> org.apache.hadoop.mapred.TestCustomOutputCommitter
> org.apache.hadoop.mapred.TestEmptyJob
> org.apache.hadoop.mapred.TestFileOutputFormat
> org.apache.hadoop.mapred.TestIsolationRunner
> org.apache.hadoop.mapred.TestJavaSerialization
> org.apache.hadoop.mapred.TestJobCleanup
> org.apache.hadoop.mapred.TestJobCounters
> org.apache.hadoop.mapred.TestJobHistoryServer
> org.apache.hadoop.mapred.TestJobInProgressListener
> org.apache.hadoop.mapred.TestJobKillAndFail
> org.apache.hadoop.mapred.TestJobName
> ...
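
One way to picture the problem is a small normalizer that drops the leading slash from a /c:/... style path before it is handed to cmd.exe. This is only a sketch: the helper name and the sample path are hypothetical, and it is not the attached patch.

```java
final class WindowsPathSketch {
  // Strips a leading '/' that precedes a drive letter, e.g. "/c:/dir" -> "c:/dir".
  static String stripLeadingSlash(String path) {
    if (path.length() >= 3
        && path.charAt(0) == '/'
        && Character.isLetter(path.charAt(1))
        && path.charAt(2) == ':') {
      return path.substring(1);
    }
    return path;  // anything else is returned unchanged
  }

  public static void main(String[] args) {
    // Hypothetical path, used only for illustration.
    System.out.println(stripLeadingSlash("/c:/tmp/taskjvm.cmd"));
  }
}
```

Paths without a drive letter after the slash, such as Unix-style absolute paths, pass through untouched.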





[jira] [Commented] (MAPREDUCE-4305) Implement delay scheduling in capacity scheduler for improving data locality

2012-06-07 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291444#comment-13291444
 ] 

Konstantin Shvachko commented on MAPREDUCE-4305:


Task locality is important. Interesting that it is only necessary to hook 
Capacity Scheduler up to the logic that already existed in JobInProgress etc. I 
went over the general logic of the patch. It looks good. But I have several 
formatting and code organization comments.
# Append _PROPERTY to new config key constants, e.g. 
NODE_LOCALITY_DELAY_PROPERTY. Looks like other constants in 
CapacitySchedulerConf are like that.
# Bend long lines.
# In CapacitySchedulerConf convert comments describing variables to a JavaDoc.
# In initializeDefaults() you should use {{capacity-scheduler}} not 
{{fairscheduler}} config variables. Also since you introduced constants for the 
keys, use them rather than the raw keys.
# JobInfo is confusing because there is already a class with that name. Call it 
something like JobLocality. I'd rather move it into JobQueuesManager, because 
the latter maintains the map of those.
# Correct indentations in CapacityTaskScheduler, particularly eliminate all 
tabs, should be spaces only.
# Add spaces between arguments, operators, and in some LOG messages.
# Add empty lines between new methods.
# updateLocalityWaitTimes() and updateLastMapLocalityLevel() should belong to 
JobQueuesManager, imo.
# JobQueuesManager.infos is a map keyed with JobInProgress. It'd be better to 
use JobID as a key?
# In TaskSchedulingMgr you need only one version of obtainNewTask to be 
abstract, the one with cachelevel parameter. The other one should not be 
abstract and just call the abstract obtainNewTask() with cachelevel set to any.
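
The last point could be sketched roughly as follows. The types are simplified stand-ins (a String for the task, an int for the cache level, and a hypothetical ANY_CACHE_LEVEL constant); the real TaskSchedulingMgr signatures differ.

```java
abstract class TaskSchedulingMgrSketch {
  // Hypothetical constant meaning "no locality constraint".
  static final int ANY_CACHE_LEVEL = Integer.MAX_VALUE;

  // The only abstract variant: subclasses decide how to honour the cache level.
  abstract String obtainNewTask(String job, int cacheLevel);

  // Convenience overload: delegates with the cache level set to "any".
  String obtainNewTask(String job) {
    return obtainNewTask(job, ANY_CACHE_LEVEL);
  }
}

final class MapSchedulingMgrSketch extends TaskSchedulingMgrSketch {
  @Override
  String obtainNewTask(String job, int cacheLevel) {
    // Placeholder result that simply records which cache level was requested.
    return job + "@" + cacheLevel;
  }

  public static void main(String[] args) {
    System.out.println(new MapSchedulingMgrSketch().obtainNewTask("job_1"));
  }
}
```

Subclasses then implement a single method, and callers that do not care about locality keep using the one-argument form.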


> Implement delay scheduling in capacity scheduler for improving data locality
> 
>
> Key: MAPREDUCE-4305
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4305
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Mayank Bansal
>Assignee: Mayank Bansal
> Attachments: MAPREDUCE-4305, MAPREDUCE-4305-1.patch
>
>
> Data-local tasks in the Capacity Scheduler are only about 40%-50%, which is 
> not good.
> In my tests with a 70-node cluster I consistently get data locality of around 
> 40-50% on a free cluster.
> I think we need to implement something like delay scheduling in the capacity 
> scheduler to improve data locality.
> http://radlab.cs.berkeley.edu/publication/308
> After implementing delay scheduling on Hadoop 22, I am getting 100% data 
> locality on a free cluster and around 90% data locality on a busy cluster.
> Thanks,
> Mayank





[jira] [Commented] (MAPREDUCE-4311) Capacity scheduler.xml does not accept decimal values for capacity and maximum-capacity settings

2012-06-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291382#comment-13291382
 ] 

Hadoop QA commented on MAPREDUCE-4311:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12531301/MAPREDUCE-4311.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 5 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2446//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2446//console

This message is automatically generated.

> Capacity scheduler.xml does not accept decimal values for capacity and 
> maximum-capacity settings
> 
>
> Key: MAPREDUCE-4311
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4311
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/capacity-sched, mrv2
>Affects Versions: 0.23.3
>Reporter: Thomas Graves
>Assignee: Karthik Kambatla
> Attachments: MAPREDUCE-4311.patch
>
>
> If capacity scheduler capacity or max capacity is set to a decimal value, it errors:
> - Error starting ResourceManager
> java.lang.NumberFormatException: For input string: "10.5"
> at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
> at java.lang.Integer.parseInt(Integer.java:458)
> at java.lang.Integer.parseInt(Integer.java:499)
> at org.apache.hadoop.conf.Configuration.getInt(Configuration.java:713)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getCapacity(CapacitySchedulerConfiguration.java:147)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.<init>(LeafQueue.java:147)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:297)
> at
> 0.20 used to take decimal and this could be an issue on large clusters that 
> would have queues with small allocations.
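
The failure is easy to reproduce outside Hadoop. The sketch below uses only the JDK: Integer.parseInt rejects "10.5" just as in the trace, while Float.parseFloat accepts it. The helper name is made up, and whether the fix should read the value as a float is an assumption rather than something the report states.

```java
final class CapacityParseSketch {
  // Returns true when the value can be parsed with Integer.parseInt.
  static boolean parsesAsInt(String value) {
    try {
      Integer.parseInt(value);
      return true;
    } catch (NumberFormatException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(parsesAsInt("10.5"));       // integer parsing rejects the decimal
    System.out.println(Float.parseFloat("10.5"));  // float parsing accepts it
  }
}
```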





[jira] [Updated] (MAPREDUCE-4311) Capacity scheduler.xml does not accept decimal values for capacity and maximum-capacity settings

2012-06-07 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4311:


Status: Patch Available  (was: Open)

> Capacity scheduler.xml does not accept decimal values for capacity and 
> maximum-capacity settings
> 
>
> Key: MAPREDUCE-4311
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4311
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/capacity-sched, mrv2
>Affects Versions: 0.23.3
>Reporter: Thomas Graves
>Assignee: Karthik Kambatla
> Attachments: MAPREDUCE-4311.patch
>
>
> If capacity scheduler capacity or max capacity is set to a decimal value, it errors:
> - Error starting ResourceManager
> java.lang.NumberFormatException: For input string: "10.5"
> at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
> at java.lang.Integer.parseInt(Integer.java:458)
> at java.lang.Integer.parseInt(Integer.java:499)
> at org.apache.hadoop.conf.Configuration.getInt(Configuration.java:713)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getCapacity(CapacitySchedulerConfiguration.java:147)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.<init>(LeafQueue.java:147)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:297)
> at
> 0.20 used to take decimal and this could be an issue on large clusters that 
> would have queues with small allocations.





[jira] [Commented] (MAPREDUCE-4306) Problem running Distributed Shell applications as a user other than the one started the daemons

2012-06-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291350#comment-13291350
 ] 

Hadoop QA commented on MAPREDUCE-4306:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12531308/MAPREDUCE-4306.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2445//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2445//console

This message is automatically generated.

> Problem running Distributed Shell applications as a user other than the one 
> started the daemons
> ---
>
> Key: MAPREDUCE-4306
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4306
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.0-alpha
>Reporter: Ahmed Radwan
>Assignee: Ahmed Radwan
> Fix For: 2.0.1-alpha
>
> Attachments: MAPREDUCE-4306.patch
>
>
> Using the tarball, if you start the yarn daemons as one user and then switch 
> to a different user, you can successfully run MR jobs, but DS jobs fail to 
> run. DS jobs can only be run by the user who started the daemons.





[jira] [Commented] (MAPREDUCE-2771) The fs docs should cover mapred.fairscheduler.assignmultiple

2012-06-07 Thread Tomohiko Kinebuchi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291341#comment-13291341
 ] 

Tomohiko Kinebuchi commented on MAPREDUCE-2771:
---

The target page is now here? -> 
http://hadoop.apache.org/common/docs/stable/fair_scheduler.html

> The fs docs should cover mapred.fairscheduler.assignmultiple
> 
>
> Key: MAPREDUCE-2771
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2771
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/fair-share, documentation
>Reporter: Eli Collins
>  Labels: newbie
> Fix For: 0.24.0
>
>
> The fs docs should cover the {{mapred.fairscheduler.assignmultiple*}} config 
> options.
> http://hadoop.apache.org/common/docs/current/fair_scheduler.html





[jira] [Updated] (MAPREDUCE-4306) Problem running Distributed Shell applications as a user other than the one started the daemons

2012-06-07 Thread Ahmed Radwan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan updated MAPREDUCE-4306:


Fix Version/s: 2.0.1-alpha
   Status: Patch Available  (was: Open)

> Problem running Distributed Shell applications as a user other than the one 
> started the daemons
> ---
>
> Key: MAPREDUCE-4306
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4306
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.0-alpha
>Reporter: Ahmed Radwan
>Assignee: Ahmed Radwan
> Fix For: 2.0.1-alpha
>
> Attachments: MAPREDUCE-4306.patch
>
>
> Using the tarball, if you start the yarn daemons as one user and then switch 
> to a different user, you can successfully run MR jobs, but DS jobs fail to 
> run. DS jobs can only be run by the user who started the daemons.





[jira] [Updated] (MAPREDUCE-4306) Problem running Distributed Shell applications as a user other than the one started the daemons

2012-06-07 Thread Ahmed Radwan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan updated MAPREDUCE-4306:


Attachment: MAPREDUCE-4306.patch

Here is the patch. I have manually tested it on a single-node cluster: I 
started the daemons as one user and then confirmed that both a different user 
and the user who started the daemons can successfully run distributed shell 
jobs. 

> Problem running Distributed Shell applications as a user other than the one 
> started the daemons
> ---
>
> Key: MAPREDUCE-4306
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4306
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.0-alpha
>Reporter: Ahmed Radwan
>Assignee: Ahmed Radwan
> Attachments: MAPREDUCE-4306.patch
>
>
> Using the tarball, if you start the yarn daemons as one user and then switch 
> to a different user, you can successfully run MR jobs, but DS jobs fail to 
> run. DS jobs can only be run by the user who started the daemons.





[jira] [Commented] (MAPREDUCE-4306) Problem running Distributed Shell applications as a user other than the one started the daemons

2012-06-07 Thread Ahmed Radwan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291321#comment-13291321
 ] 

Ahmed Radwan commented on MAPREDUCE-4306:
-

To reproduce this issue using the tarball on a single node cluster:

1- Start all the daemons using user1.
2- Switch to user2 and try to submit a distributed shell job:
{code}
bin/hadoop jar 
./share/hadoop/mapreduce/hadoop-yarn-applications-distributedshell-3.0.0-SNAPSHOT.jar
 org.apache.hadoop.yarn.applications.distributedshell.Client --jar 
./share/hadoop/mapreduce/hadoop-yarn-applications-distributedshell-3.0.0-SNAPSHOT.jar
 --shell_command ls --num_containers 1 --debug
{code}

I'll be uploading a patch momentarily.

> Problem running Distributed Shell applications as a user other than the one 
> started the daemons
> ---
>
> Key: MAPREDUCE-4306
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4306
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.0-alpha
>Reporter: Ahmed Radwan
>Assignee: Ahmed Radwan
>
> Using the tarball, if you start the yarn daemons as one user and then switch 
> to a different user, you can successfully run MR jobs, but DS jobs fail to 
> run. DS jobs can only be run by the user who started the daemons.





[jira] [Updated] (MAPREDUCE-4267) mavenize pipes

2012-06-07 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated MAPREDUCE-4267:
-

Attachment: MAPREDUCE-4267.patch

Fix compilation of 32-bit on a 64-bit machine (need to use CXX flags).  Don't 
package the pom file.

> mavenize pipes
> --
>
> Key: MAPREDUCE-4267
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4267
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.3
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Critical
> Attachments: MAPREDUCE-4267.001.rm.patch, 
> MAPREDUCE-4267.001.trimmed.patch, MAPREDUCE-4267.002.trimmed.patch, 
> MAPREDUCE-4267.patch, MAPREDUCE-4267.patch, MAPREDUCE-4267.sh
>
>
> We are still building pipes out of the old mrv1 directories using ant.  Move 
> it over to the mrv2 dir structure.  





[jira] [Commented] (MAPREDUCE-4321) DefaultTaskController fails to launch tasks on Windows

2012-06-07 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291310#comment-13291310
 ] 

Daryn Sharp commented on MAPREDUCE-4321:


It was a suggestion, so I'm perfectly fine with another jira if you think it 
would be useful.  {{Path}} allows Hadoop methods to work seamlessly with either 
local or remote paths.  Adding {{File}} counterparts would be cumbersome, yet 
converting Windows files to paths isn't as straightforward.  Someone will 
unknowingly do it wrong in the future and someone else will have to chase it 
down.  A ctor that takes a file reduces but doesn't eliminate the chance 
someone will do it wrong.
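
To illustrate why the conversion is easy to get wrong: a URI-style path for a 
Windows drive carries a leading slash (as in the failing command line quoted 
below), which has to be dropped before the path is handed to cmd.exe. The 
following is only a minimal sketch in plain Java. The class and method names 
are hypothetical, the example path is made up, and this is not the code from 
the attached patch.

```java
public class WindowsPathSketch {

    // Hypothetical helper: turns "/c:/dir/file" into "c:/dir/file".
    // Paths without a drive letter after the leading slash are returned unchanged.
    static String toLocalWindowsPath(String uriPath) {
        boolean hasDrivePrefix = uriPath.length() >= 3
                && uriPath.charAt(0) == '/'
                && Character.isLetter(uriPath.charAt(1))
                && uriPath.charAt(2) == ':';
        return hasDrivePrefix ? uriPath.substring(1) : uriPath;
    }

    public static void main(String[] args) {
        // Made-up example path, used only for illustration.
        System.out.println(toLocalWindowsPath("/c:/tmp/taskjvm.cmd"));
        System.out.println(toLocalWindowsPath("/tmp/taskjvm.cmd"));
    }
}
```

A helper like this still has to be called at every site that builds a command 
line, which is the kind of thing someone can forget.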

> DefaultTaskController fails to launch tasks on Windows
> --
>
> Key: MAPREDUCE-4321
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4321
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Ivan Mitic
>Assignee: Ivan Mitic
> Attachments: MAPREDUCE-4321-branch-1-win.patch
>
>
> DefaultTaskController#launchTask tries to run the child JVM task with the 
> following command line:
> {code}cmd.exe /c /c:/some/path.../taskjvm.cmd{code}
> And this fails because the given path is prefixed with a forward slash. This 
> also causes a number of tests to fail:
> org.apache.hadoop.conf.TestNoDefaultsJobConf
> org.apache.hadoop.fs.TestCopyFiles
> org.apache.hadoop.mapred.TestBadRecords
> org.apache.hadoop.mapred.TestClusterMRNotification
> org.apache.hadoop.mapred.TestCompressedEmptyMapOutputs
> org.apache.hadoop.mapred.TestControlledMapReduceJob
> org.apache.hadoop.mapred.TestCustomOutputCommitter
> org.apache.hadoop.mapred.TestEmptyJob
> org.apache.hadoop.mapred.TestFileOutputFormat
> org.apache.hadoop.mapred.TestIsolationRunner
> org.apache.hadoop.mapred.TestJavaSerialization
> org.apache.hadoop.mapred.TestJobCleanup
> org.apache.hadoop.mapred.TestJobCounters
> org.apache.hadoop.mapred.TestJobHistoryServer
> org.apache.hadoop.mapred.TestJobInProgressListener
> org.apache.hadoop.mapred.TestJobKillAndFail
> org.apache.hadoop.mapred.TestJobName
> ...





[jira] [Updated] (MAPREDUCE-4311) Capacity scheduler.xml does not accept decimal values for capacity and maximum-capacity settings

2012-06-07 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4311:


Attachment: MAPREDUCE-4311.patch

> Capacity scheduler.xml does not accept decimal values for capacity and 
> maximum-capacity settings
> 
>
> Key: MAPREDUCE-4311
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4311
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/capacity-sched, mrv2
>Affects Versions: 0.23.3
>Reporter: Thomas Graves
>Assignee: Karthik Kambatla
> Attachments: MAPREDUCE-4311.patch
>
>
> If the capacity scheduler capacity or max capacity is set to a decimal value, it errors:
> - Error starting ResourceManager
> java.lang.NumberFormatException: For input string: "10.5"
> at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
> at java.lang.Integer.parseInt(Integer.java:458)
> at java.lang.Integer.parseInt(Integer.java:499)
> at org.apache.hadoop.conf.Configuration.getInt(Configuration.java:713)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getCapacity(CapacitySchedulerConfiguration.java:147)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.<init>(LeafQueue.java:147)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:297)
> at
> 0.20 used to take decimal and this could be an issue on large clusters that 
> would have queues with small allocations.





[jira] [Updated] (MAPREDUCE-4311) Capacity scheduler.xml does not accept decimal values for capacity and maximum-capacity settings

2012-06-07 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4311:


Status: Patch Available  (was: Open)

Uploading the patch with the following changes:
-Capacities changed to float
-Modified relevant tests to use floating point capacities (10.5)
-Ran the tests - TestCapacityScheduler, TestParentQueue, TestLeafQueue, 
TestRMWebServicesSched
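
For reference, the failure in the stack trace below comes down to integer 
parsing rejecting a decimal string, while float parsing accepts it. A minimal 
sketch in plain Java follows. The class and method names are hypothetical and 
this is not the CapacitySchedulerConfiguration code from the patch.

```java
public class CapacityParseSketch {

    // Hypothetical helper: reads a capacity value as a float percentage.
    static float parseCapacity(String raw) {
        return Float.parseFloat(raw.trim());
    }

    // Reports whether the value would survive Integer.parseInt,
    // which is what the stack trace shows being called.
    static boolean parsesAsInt(String raw) {
        try {
            Integer.parseInt(raw.trim());
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("10.5 parses as int:   " + parsesAsInt("10.5"));
        System.out.println("10.5 parses as float: " + parseCapacity("10.5"));
    }
}
```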

> Capacity scheduler.xml does not accept decimal values for capacity and 
> maximum-capacity settings
> 
>
> Key: MAPREDUCE-4311
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4311
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/capacity-sched, mrv2
>Affects Versions: 0.23.3
>Reporter: Thomas Graves
>Assignee: Karthik Kambatla
>
> If the capacity scheduler capacity or max capacity is set to a decimal value, it errors:
> - Error starting ResourceManager
> java.lang.NumberFormatException: For input string: "10.5"
> at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
> at java.lang.Integer.parseInt(Integer.java:458)
> at java.lang.Integer.parseInt(Integer.java:499)
> at org.apache.hadoop.conf.Configuration.getInt(Configuration.java:713)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getCapacity(CapacitySchedulerConfiguration.java:147)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.<init>(LeafQueue.java:147)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:297)
> at
> 0.20 used to take decimal and this could be an issue on large clusters that 
> would have queues with small allocations.





[jira] [Updated] (MAPREDUCE-4311) Capacity scheduler.xml does not accept decimal values for capacity and maximum-capacity settings

2012-06-07 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4311:


Status: Open  (was: Patch Available)

> Capacity scheduler.xml does not accept decimal values for capacity and 
> maximum-capacity settings
> 
>
> Key: MAPREDUCE-4311
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4311
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/capacity-sched, mrv2
>Affects Versions: 0.23.3
>Reporter: Thomas Graves
>Assignee: Karthik Kambatla
>
> If the capacity scheduler capacity or max capacity is set to a decimal value, it errors:
> - Error starting ResourceManager
> java.lang.NumberFormatException: For input string: "10.5"
> at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
> at java.lang.Integer.parseInt(Integer.java:458)
> at java.lang.Integer.parseInt(Integer.java:499)
> at org.apache.hadoop.conf.Configuration.getInt(Configuration.java:713)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getCapacity(CapacitySchedulerConfiguration.java:147)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.<init>(LeafQueue.java:147)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:297)
> at
> 0.20 used to take decimal and this could be an issue on large clusters that 
> would have queues with small allocations.





[jira] [Updated] (MAPREDUCE-3921) MR AM should act on the nodes liveliness information when nodes go up/down/unhealthy

2012-06-07 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-3921:
-

Status: Open  (was: Patch Available)

Sorry to come in late.
Some clarifications:
# MR1 JT kills all running tasks on a TT when it's deemed 'lost'.
# It also kills all completed maps on that TT for 'active' jobs.
# The tasks are marked KILLED rather than FAILED and thus don't count towards 
the job, which is correct since it wasn't the job's fault.
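
The accounting in the list above can be sketched roughly as follows. This is a 
toy model in plain Java, not JobTracker code: the TaskState enum and the 
failuresChargedToJob method are hypothetical names used only to show that 
KILLED attempts are left out of the failure count.

```java
import java.util.Arrays;
import java.util.List;

public class LostTrackerSketch {

    // Hypothetical, simplified set of attempt states.
    enum TaskState { SUCCEEDED, FAILED, KILLED }

    // Only FAILED attempts count against the job. KILLED attempts
    // (for example, those on a lost tracker) are ignored.
    static long failuresChargedToJob(List<TaskState> attempts) {
        return attempts.stream()
                .filter(state -> state == TaskState.FAILED)
                .count();
    }

    public static void main(String[] args) {
        List<TaskState> attempts = Arrays.asList(
                TaskState.KILLED,    // was running on the lost tracker
                TaskState.KILLED,    // completed map on the lost tracker, re-run
                TaskState.FAILED,    // genuine task failure
                TaskState.SUCCEEDED);
        System.out.println("failures charged to job: " + failuresChargedToJob(attempts));
    }
}
```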

Hope this helps.

> MR AM should act on the nodes liveliness information when nodes go 
> up/down/unhealthy
> 
>
> Key: MAPREDUCE-3921
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3921
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am, mrv2
>Affects Versions: 0.23.0
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Bikas Saha
> Fix For: 0.23.2
>
> Attachments: MAPREDUCE-3921-1.patch, MAPREDUCE-3921-3.patch, 
> MAPREDUCE-3921-4.patch, MAPREDUCE-3921-5.patch, MAPREDUCE-3921-6.patch, 
> MAPREDUCE-3921-7.patch, MAPREDUCE-3921-branch-0.23.patch, 
> MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921-branch-0.23.patch, 
> MAPREDUCE-3921.patch
>
>






[jira] [Commented] (MAPREDUCE-4136) Hadoop streaming might succeed even through reducer fails

2012-06-07 Thread Matteo Bertozzi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291260#comment-13291260
 ] 

Matteo Bertozzi commented on MAPREDUCE-4136:


Yes, this is fixed by MAPREDUCE-3790; the IOException during 
clientOut_.flush() is now caught.
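
For readers following along, the failure mode described in the issue (a flush 
error that is only logged, and an exit status that is never checked) can be 
sketched in plain Java as below. This is an illustration under assumptions, 
not the MAPREDUCE-3790 change or the PipeMapRed code: the class name, the 
finish method, and the use of an IntSupplier for the exit status are all 
hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.function.IntSupplier;

public class StreamingFinishSketch {

    // Hypothetical finish step: flush the pipe to the child process, then
    // fail the task if the child exited with a non-zero status.
    static void finish(OutputStream clientOut, IntSupplier exitStatus) throws IOException {
        IOException flushError = null;
        try {
            clientOut.flush();
        } catch (IOException e) {
            flushError = e; // remember the error instead of only logging it
        }
        int status = exitStatus.getAsInt();
        if (status != 0) {
            throw new IOException("subprocess failed with exit status " + status, flushError);
        }
        if (flushError != null) {
            throw flushError;
        }
    }

    public static void main(String[] args) {
        // A stream whose flush fails, standing in for a pipe to a dead process.
        OutputStream broken = new OutputStream() {
            @Override public void write(int b) { }
            @Override public void flush() throws IOException {
                throw new IOException("Broken pipe");
            }
        };
        try {
            finish(broken, () -> 1);
            System.out.println("task succeeded");
        } catch (IOException e) {
            System.out.println("task failed: " + e.getMessage());
        }
    }
}
```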

> Hadoop streaming might succeed even through reducer fails
> -
>
> Key: MAPREDUCE-4136
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4136
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/streaming
>Affects Versions: 0.20.205.0
>Reporter: Wouter de Bie
> Attachments: mapreduce-4136.patch
>
>
> Hadoop streaming can succeed even though the reducer has failed. This 
> happens when Hadoop calls {{PipeReducer.close()}}, but in the meantime the 
> reducer has failed and the process has died. When {{clientOut_.flush()}} 
> throws an {{IOException}} in {{PipeMapRed.mapRedFinish()}}, this exception is 
> caught but only logged. The exit status of the child process is never checked 
> and the task is marked as successful.
> I've attached a patch that seems to fix it for us.





[jira] [Commented] (MAPREDUCE-2377) task-controller fails to parse configuration if it doesn't end in \n

2012-06-07 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291251#comment-13291251
 ] 

Todd Lipcon commented on MAPREDUCE-2377:


Verified this bug is not present in the MR2 container executor, so marked as 
resolved.

> task-controller fails to parse configuration if it doesn't end in \n
> 
>
> Key: MAPREDUCE-2377
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2377
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task-controller
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: Benoy Antony
>  Labels: critical-0.22.0
> Fix For: 1.1.0, 0.22.1
>
> Attachments: mr-2377-0.22.patch, mr-2377-20.txt
>
>
> If the task-controller.cfg file doesn't end in a newline, it fails to parse 
> properly.





[jira] [Resolved] (MAPREDUCE-2377) task-controller fails to parse configuration if it doesn't end in \n

2012-06-07 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved MAPREDUCE-2377.


Resolution: Fixed

> task-controller fails to parse configuration if it doesn't end in \n
> 
>
> Key: MAPREDUCE-2377
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2377
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task-controller
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: Benoy Antony
>  Labels: critical-0.22.0
> Fix For: 1.1.0, 0.22.1
>
> Attachments: mr-2377-0.22.patch, mr-2377-20.txt
>
>
> If the task-controller.cfg file doesn't end in a newline, it fails to parse 
> properly.





[jira] [Commented] (MAPREDUCE-4321) DefaultTaskController fails to launch tasks on Windows

2012-06-07 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291248#comment-13291248
 ] 

Ivan Mitic commented on MAPREDUCE-4321:
---

Thanks Daryn!

bq. One suggestion to maybe consider, would it in general help to create a 
Path#toFile() and Path(File) ctor? I had considered the additional ctor change on 
HADOOP-8139 and I believe Doug liked the idea.
I also like the idea. Although one can argue: instead of doing these 
conversions, why not just use File or Path across the board in that scenario 
(RawLocalFileSystem aside)? What do you think about doing this in a separate 
Jira so that we can easily pull it out if needed?

> DefaultTaskController fails to launch tasks on Windows
> --
>
> Key: MAPREDUCE-4321
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4321
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Ivan Mitic
>Assignee: Ivan Mitic
> Attachments: MAPREDUCE-4321-branch-1-win.patch
>
>
> DefaultTaskController#launchTask tries to run the child JVM task with the 
> following command line:
> {code}cmd.exe /c /c:/some/path.../taskjvm.cmd{code}
> And this fails because the given path is prefixed with a forward slash. This 
> also causes a number of tests to fail:
> org.apache.hadoop.conf.TestNoDefaultsJobConf
> org.apache.hadoop.fs.TestCopyFiles
> org.apache.hadoop.mapred.TestBadRecords
> org.apache.hadoop.mapred.TestClusterMRNotification
> org.apache.hadoop.mapred.TestCompressedEmptyMapOutputs
> org.apache.hadoop.mapred.TestControlledMapReduceJob
> org.apache.hadoop.mapred.TestCustomOutputCommitter
> org.apache.hadoop.mapred.TestEmptyJob
> org.apache.hadoop.mapred.TestFileOutputFormat
> org.apache.hadoop.mapred.TestIsolationRunner
> org.apache.hadoop.mapred.TestJavaSerialization
> org.apache.hadoop.mapred.TestJobCleanup
> org.apache.hadoop.mapred.TestJobCounters
> org.apache.hadoop.mapred.TestJobHistoryServer
> org.apache.hadoop.mapred.TestJobInProgressListener
> org.apache.hadoop.mapred.TestJobKillAndFail
> org.apache.hadoop.mapred.TestJobName
> ...





[jira] [Moved] (MAPREDUCE-4325) Rename ProcessTree.isSetsidAvailable

2012-06-07 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha moved HADOOP-8492 to MAPREDUCE-4325:
---

Fix Version/s: (was: 1.1.0)
   1.1.0
Affects Version/s: (was: 1.0.0)
   1.0.0
  Key: MAPREDUCE-4325  (was: HADOOP-8492)
  Project: Hadoop Map/Reduce  (was: Hadoop Common)

> Rename ProcessTree.isSetsidAvailable
> 
>
> Key: MAPREDUCE-4325
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4325
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Fix For: 1.1.0
>
>
> The logical use of this member is to find out if processes can be grouped 
> into a unit for process manipulation, e.g. killing process groups.
> setsid is the Linux implementation and it leaks into the name.
> I suggest renaming it to isProcessGroupAvailable.





[jira] [Commented] (MAPREDUCE-4318) TestRecoveryManagershould not use raw and deprecated configuration parameters.

2012-06-07 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291203#comment-13291203
 ] 

Benoy Antony commented on MAPREDUCE-4318:
-

The other option was to use the new scheme of specifying a mapred-queues.xml 
containing the queue configuration. I used the QueueManagerTestUtils class to 
achieve this. But there are other mapred-queues.xml files in the classpath, 
with different configuration, which get picked up before the test's 
mapred-queues.xml.

These files seem to be created when I build using Eclipse, and if I remove those 
mapred-queues.xml files, the test passes. So this may be an Eclipse-created problem. 
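
One way to see which copies are on the classpath, and in what order, is to ask 
the class loader for every resource with that name. The sketch below uses only 
the standard ClassLoader.getResources call. The class and method names are 
hypothetical, and it assumes it is run with the same classpath as the test.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

public class ClasspathResourceSketch {

    // Returns every classpath location that provides the named resource,
    // in the order the class loader reports them.
    static List<URL> findAll(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        try {
            return Collections.list(loader.getResources(resourceName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<URL> copies = findAll("mapred-queues.xml");
        System.out.println(copies.size() + " copies of mapred-queues.xml found");
        for (URL url : copies) {
            System.out.println("  " + url);
        }
    }
}
```

The first entry printed is usually the copy a single-resource lookup would 
return, so an unexpected location at the top of the list would be consistent 
with the behaviour described above.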

The old scheme of defining queues does not use mapred-queues.xml and hence will 
work regardless of the multiple mapred-queues.xml issue.

Since we are not testing queue management here, I believe keeping the 
following line makes the test more reliable.

mr.getJobTrackerConf().set(DeprecatedQueueConfigurationParser.MAPRED_QUEUE_NAMES_KEY,
"default");


So I recommend going with the attached patch. Please let me know if there are 
other ideas.


> TestRecoveryManagershould not use raw and deprecated configuration parameters.
> --
>
> Key: MAPREDUCE-4318
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4318
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.22.1
>Reporter: Konstantin Shvachko
>Assignee: Benoy Antony
> Attachments: MAPREDUCE-4318.patch
>
>
> TestRecoveryManager should not use deprecated config keys, and should use 
> constants for the keys where possible.





[jira] [Commented] (MAPREDUCE-4260) Investigate use of JobObject to spawn tasks on Windows

2012-06-07 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291193#comment-13291193
 ] 

Bikas Saha commented on MAPREDUCE-4260:
---

1. I agree the check looks a bit weird. But I put that in because I am not sure 
how this will affect existing Cygwin installations where people may not have 
winutils built. I think that is something we need to figure out. The code in 
winutils already has the dependency comment.
2. Will do
3. Again, I am mainly concerned about installations that don't have winutils, and 
I also want to guard against any unexpected use cases that might break. I agree that 
this code should disappear soon.
4. Will do
5. From what I understand this error would come if the 
JOBOBJECT_BASIC_PROCESS_ID_LIST does not have enough space to return 
information about all processes. In that case, one needs to reallocate the 
structure based on the value of NumberOfAssignedProcesses and call 
QueryInformationJobObject() again. Since I am not interested in per process 
information I choose to ignore that error. Let me know if my understanding is 
not accurate.

> Investigate use of JobObject to spawn tasks on Windows
> --
>
> Key: MAPREDUCE-4260
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4260
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 1.0.0
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: MAPREDUCE-4260.branch-1-win.patch, MAPREDUCE-4260.patch, 
> test.cpp
>
>
> Currently, the Windows version spawns the task as a normal cmd shell from 
> which other downstream exe's are spawned. However, this is not bulletproof 
> because if an intermediate process exits before its child exits, then the 
> parent-child process tree relationship cannot be constructed. Windows has a 
> concept of JobObject that is similar to the setsid behavior used in Linux. 
> The initial spawned task could be launched within its JobObject. Thereafter, 
> process termination, memory management etc could be operated on the JobObject.





[jira] [Commented] (MAPREDUCE-4323) NM leaks sockets

2012-06-07 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291192#comment-13291192
 ] 

Todd Lipcon commented on MAPREDUCE-4323:


See comment on HADOOP-8490: I think the NM should just be side-stepping the FS 
cache, so it can explicitly close the FS when necessary.

> NM leaks sockets
> 
>
> Key: MAPREDUCE-4323
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4323
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 0.23.0, 0.24.0, 2.0.0-alpha
>Reporter: Daryn Sharp
>Priority: Critical
>
> The NM is exhausting its fds because it's not closing fs instances when the 
> app is finished.





[jira] [Commented] (MAPREDUCE-4323) NM leaks sockets

2012-06-07 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291175#comment-13291175
 ] 

Tsz Wo (Nicholas), SZE commented on MAPREDUCE-4323:
---

This looks like a problem of the newly added socket cache.  Once it is fixed 
(say, it is removed for the sake of discussion), are there other problems?

> NM leaks sockets
> 
>
> Key: MAPREDUCE-4323
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4323
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 0.23.0, 0.24.0, 2.0.0-alpha
>Reporter: Daryn Sharp
>Priority: Critical
>
> The NM is exhausting its fds because it's not closing fs instances when the 
> app is finished.





[jira] [Commented] (MAPREDUCE-4324) JobClient can perhaps set mapreduce.job.credentials.binary rather than expect its presence?

2012-06-07 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291137#comment-13291137
 ] 

Ashutosh Chauhan commented on MAPREDUCE-4324:
-

As someone working higher up the stack, I have seen this {{if}} code block 
in all the clients. Ideally, {{jobclient}} should do it, freeing apps from this 
unnecessary requirement. Thanks, Harsh, for picking this up! 

> JobClient can perhaps set mapreduce.job.credentials.binary rather than expect 
> its presence?
> ---
>
> Key: MAPREDUCE-4324
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4324
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1, mrv2, security
>Affects Versions: 0.22.0, 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Harsh J
>
> HDFS-1007 added a requirement for the property 
> "mapreduce.job.credentials.binary", which has led Oozie to add the following 
> duplicate snippet to all its job-launching main classes, such as the Pig, 
> Hive, MR and Sqoop actions:
> {code}
> if (System.getenv("HADOOP_TOKEN_FILE_LOCATION") != null) {
>   jobConf.set("mapreduce.job.credentials.binary", 
>       System.getenv("HADOOP_TOKEN_FILE_LOCATION"));
> }
> {code}
> The same is required for any client program that launches a job from within a 
> task.
> Why can't this simply be set by the JobClient initialization bits itself? If 
> no one imagines it causing issues, I'd like to add this snippet somewhere in 
> JobSubmitter before it requests NN/JT, as otherwise we'd get…
> {code}
> org.apache.hadoop.ipc.RemoteException: java.io.IOException: Delegation Token can be issued only with kerberos or web authentication
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getDelegationToken(FSNamesystem.java:5509)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.getDelegationToken(NameNode.java:536)
>     at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1434)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1430)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1428)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1107)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
>     at $Proxy6.getDelegationToken(Unknown Source)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>     at $Proxy6.getDelegationToken(Unknown Source)
>     at org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:331)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:605)
>     at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:115)
>     at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:79)
>     at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:851)
>     at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
>     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
>     at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:807)
>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1242)
> {code}
> … or similar errors when a user submits a job from a task running in a 
> secured cluster.
> Let me know your thoughts on this!





[jira] [Updated] (MAPREDUCE-4324) JobClient can perhaps set mapreduce.job.credentials.binary rather than expect its presence?

2012-06-07 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated MAPREDUCE-4324:
---

Component/s: security

> JobClient can perhaps set mapreduce.job.credentials.binary rather than expect 
> its presence?
> ---
>
> Key: MAPREDUCE-4324
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4324
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1, mrv2, security
>Affects Versions: 0.22.0, 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Harsh J
>





[jira] [Commented] (MAPREDUCE-4267) mavenize pipes

2012-06-07 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291124#comment-13291124
 ] 

Thomas Graves commented on MAPREDUCE-4267:
--

Sorry, it looks like my previous comment was wrong. I need to debug further.

> mavenize pipes
> --
>
> Key: MAPREDUCE-4267
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4267
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.3
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Critical
> Attachments: MAPREDUCE-4267.001.rm.patch, 
> MAPREDUCE-4267.001.trimmed.patch, MAPREDUCE-4267.002.trimmed.patch, 
> MAPREDUCE-4267.patch, MAPREDUCE-4267.sh
>
>
> We are still building pipes out of the old mrv1 directories using ant.  Move 
> it over to the mrv2 dir structure.  





[jira] [Commented] (MAPREDUCE-4267) mavenize pipes

2012-06-07 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291117#comment-13291117
 ] 

Thomas Graves commented on MAPREDUCE-4267:
--

Looks like the bit I added from HADOOP-8489 broke the 32-bit build when 
building on a 64-bit machine.  I'll back that out.

> mavenize pipes
> --
>
> Key: MAPREDUCE-4267
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4267
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.3
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Critical
> Attachments: MAPREDUCE-4267.001.rm.patch, 
> MAPREDUCE-4267.001.trimmed.patch, MAPREDUCE-4267.002.trimmed.patch, 
> MAPREDUCE-4267.patch, MAPREDUCE-4267.sh
>
>
> We are still building pipes out of the old mrv1 directories using ant.  Move 
> it over to the mrv2 dir structure.  





[jira] [Updated] (MAPREDUCE-4267) mavenize pipes

2012-06-07 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated MAPREDUCE-4267:
-

Attachment: MAPREDUCE-4267.sh

This should be run before the patch.

./MAPREDUCE-4267.sh svn
patch -p0 < MAPREDUCE-4267.patch

> mavenize pipes
> --
>
> Key: MAPREDUCE-4267
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4267
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.3
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Critical
> Attachments: MAPREDUCE-4267.001.rm.patch, 
> MAPREDUCE-4267.001.trimmed.patch, MAPREDUCE-4267.002.trimmed.patch, 
> MAPREDUCE-4267.patch, MAPREDUCE-4267.sh
>
>
> We are still building pipes out of the old mrv1 directories using ant.  Move 
> it over to the mrv2 dir structure.  





[jira] [Commented] (MAPREDUCE-2980) Fetch failures and other related issues in Jetty 6.1.26

2012-06-07 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291112#comment-13291112
 ] 

Todd Lipcon commented on MAPREDUCE-2980:


Still no 6.1.27. We've been shipping the version I linked to from GitHub above: 
https://github.com/toddlipcon/jetty-hadoop-fix/tree/6.1.26.cloudera.1

That, combined with MAPREDUCE-3184, has made the problem quite livable.

We also found that the upgrade from 6.1.26 to the GitHub branch improved 
performance noticeably for shuffle-intensive jobs.

> Fetch failures and other related issues in Jetty 6.1.26
> ---
>
> Key: MAPREDUCE-2980
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2980
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
>Affects Versions: 0.20.205.0, 0.23.0
>Reporter: Todd Lipcon
>Priority: Critical
>
> Since upgrading Jetty from 6.1.14 to 6.1.26 we've had a ton of HTTP-related 
> issues, including:
> - Much higher incidence of fetch failures
> - A few strange file-descriptor related bugs (e.g. MAPREDUCE-2389)
> - A few unexplained issues where long "fsck"s on the NameNode drop out 
> halfway through with a ClosedChannelException
> Stress tests with 1Map x 1Reduce sleep jobs reliably reproduce fetch 
> failures at a rate of about 1 per million on a 25 node test cluster. These 
> problems are all new since the upgrade from 6.1.14.





[jira] [Updated] (MAPREDUCE-4267) mavenize pipes

2012-06-07 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated MAPREDUCE-4267:
-

Attachment: MAPREDUCE-4267.patch

Here is an initial patch.  

Files added to the tarball are listed below. I'm using the packaging type of 
pom in the hadoop-pipes pom.xml because all it does is run ant to generate the 
libraries via cmake, and there are no jar files.  If anyone has a better way of 
doing this, let me know.

I'll see if I can figure out a way to tell it not to package the pom.

> hadoop-3.0.0-SNAPSHOT/include/Pipes.hh
> hadoop-3.0.0-SNAPSHOT/include/SerialUtils.hh
> hadoop-3.0.0-SNAPSHOT/include/StringUtils.hh
> hadoop-3.0.0-SNAPSHOT/include/TemplateFactory.hh
38a43
> hadoop-3.0.0-SNAPSHOT/lib/native/libhadooppipes.a
39a45
> hadoop-3.0.0-SNAPSHOT/lib/native/libhadooputils.a
522a529
> hadoop-3.0.0-SNAPSHOT/share/hadoop/tools/lib/hadoop-pipes-3.0.0-SNAPSHOT.pom

> mavenize pipes
> --
>
> Key: MAPREDUCE-4267
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4267
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.3
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Critical
> Attachments: MAPREDUCE-4267.001.rm.patch, 
> MAPREDUCE-4267.001.trimmed.patch, MAPREDUCE-4267.002.trimmed.patch, 
> MAPREDUCE-4267.patch, MAPREDUCE-4267.sh
>
>
> We are still building pipes out of the old mrv1 directories using ant.  Move 
> it over to the mrv2 dir structure.  





[jira] [Created] (MAPREDUCE-4324) JobClient can perhaps set mapreduce.job.credentials.binary rather than expect its presence?

2012-06-07 Thread Harsh J (JIRA)
Harsh J created MAPREDUCE-4324:
--

 Summary: JobClient can perhaps set 
mapreduce.job.credentials.binary rather than expect its presence?
 Key: MAPREDUCE-4324
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4324
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 2.0.0-alpha, 0.22.0
Reporter: Harsh J
Assignee: Harsh J


HDFS-1007 added a requirement for the property 
"mapreduce.job.credentials.binary", which has led Oozie to add the following 
duplicate snippet to all its job-launching main classes, such as the Pig, Hive, 
MR and Sqoop actions:

{code}
if (System.getenv("HADOOP_TOKEN_FILE_LOCATION") != null) {
  jobConf.set("mapreduce.job.credentials.binary", 
      System.getenv("HADOOP_TOKEN_FILE_LOCATION"));
}
{code}

The same is required for any client program that launches a job from within a task.

Why can't this simply be set by the JobClient initialization bits itself? If no 
one imagines it causing issues, I'd like to add this snippet somewhere in 
JobSubmitter before it requests NN/JT, as otherwise we'd get…

{code}
org.apache.hadoop.ipc.RemoteException: java.io.IOException: Delegation Token can be issued only with kerberos or web authentication
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getDelegationToken(FSNamesystem.java:5509)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.getDelegationToken(NameNode.java:536)
    at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1434)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1430)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1428)

    at org.apache.hadoop.ipc.Client.call(Client.java:1107)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
    at $Proxy6.getDelegationToken(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at $Proxy6.getDelegationToken(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:331)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:605)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:115)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:79)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:851)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:807)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1242)
{code}

… or similar errors when a user submits a job from a task running in a secured 
cluster.

Let me know your thoughts on this!
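
As a rough illustration of what moving the snippet into the client could look like, here is a self-contained sketch. It is not the actual JobSubmitter change: TokenFileDefaults and apply() are hypothetical names, java.util.Properties stands in for JobConf so the example runs without Hadoop on the classpath, and leaving an explicitly set value untouched is an assumption of this sketch rather than something decided in this issue.

```java
import java.util.Map;
import java.util.Properties;

/** Hypothetical helper; Properties stands in for Hadoop's JobConf in this sketch. */
final class TokenFileDefaults {
    static final String ENV_VAR = "HADOOP_TOKEN_FILE_LOCATION";
    static final String CONF_KEY = "mapreduce.job.credentials.binary";

    private TokenFileDefaults() {}

    /**
     * Copies the token file location from the environment into the job
     * configuration unless the key is already set.
     * Returns true if the configuration was changed.
     */
    static boolean apply(Map<String, String> env, Properties conf) {
        String tokenFile = env.get(ENV_VAR);
        if (tokenFile == null || tokenFile.isEmpty()) {
            return false; // no token file in the environment, nothing to do
        }
        if (conf.getProperty(CONF_KEY) != null) {
            return false; // assumption: an explicit setting wins
        }
        conf.setProperty(CONF_KEY, tokenFile);
        return true;
    }
}
```

A client would call TokenFileDefaults.apply(System.getenv(), conf) once before submission, which is what each Oozie launcher currently does by hand.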





[jira] [Commented] (MAPREDUCE-4323) NM leaks sockets

2012-06-07 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291078#comment-13291078
 ] 

Daryn Sharp commented on MAPREDUCE-4323:


In particular, {{DFSClient}} maintains a socket cache.  Closed sockets are not 
detected until another connection is needed or the client is closed.  That's 
a separate issue, but the NM's failure to close filesystems for a user after the 
app completes leaks sockets in the CLOSE_WAIT state, which eventually 
exhausts the fds for the process.

Calling {{FileSystem.closeAllForUGI}}, as the JT does, is troublesome in that it 
may close the fs for other apps running as that user.  One approach is to 
partition the fs cache so that each app maintains its own cache of 
filesystems.  See HADOOP-8490 for possible approaches, which would allow the 
app's filesystems to be closed the way the JT does.

Also note that failure to close filesystems causes all future jobs to use the 
configuration of the first job.  This will be very problematic, so it's 
imperative to ensure that each app gets its own cached instances.
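
To illustrate the partitioning idea, here is a minimal, self-contained sketch. It does not use the Hadoop FileSystem API and is not one of the HADOOP-8490 proposals: FakeFs, AppScopedFsCache and closeAllForApp() are hypothetical names, and keying the cache by application id first and user second is an assumption made for the example.

```java
import java.io.Closeable;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a cached filesystem client that holds sockets. */
final class FakeFs implements Closeable {
    final String user;
    private boolean open = true;

    FakeFs(String user) {
        this.user = user;
    }

    boolean isOpen() {
        return open;
    }

    @Override
    public void close() {
        open = false;
    }
}

/** Cache partitioned by application id, then by user. */
final class AppScopedFsCache {
    private final Map<String, Map<String, FakeFs>> byApp = new HashMap<>();

    /** Returns the cached instance for (appId, user), creating it on first use. */
    synchronized FakeFs get(String appId, String user) {
        return byApp.computeIfAbsent(appId, k -> new HashMap<>())
                    .computeIfAbsent(user, FakeFs::new);
    }

    /** Closes only the instances owned by the finished app; returns how many were closed. */
    synchronized int closeAllForApp(String appId) {
        Map<String, FakeFs> instances = byApp.remove(appId);
        if (instances == null) {
            return 0;
        }
        for (FakeFs fs : instances.values()) {
            fs.close();
        }
        return instances.size();
    }
}
```

With this layout, closing app_1's instances leaves app_2's instance for the same user open, and a later app gets a fresh instance instead of inheriting one created with the first job's configuration.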

> NM leaks sockets
> 
>
> Key: MAPREDUCE-4323
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4323
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 0.23.0, 0.24.0, 2.0.0-alpha
>Reporter: Daryn Sharp
>Priority: Critical
>
> The NM is exhausting its fds because it's not closing fs instances when the 
> app is finished.





[jira] [Created] (MAPREDUCE-4323) NM leaks sockets

2012-06-07 Thread Daryn Sharp (JIRA)
Daryn Sharp created MAPREDUCE-4323:
--

 Summary: NM leaks sockets
 Key: MAPREDUCE-4323
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4323
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.0.0-alpha, 0.23.0, 0.24.0
Reporter: Daryn Sharp
Priority: Critical


The NM is exhausting its fds because it's not closing fs instances when the app 
is finished.





[jira] [Commented] (MAPREDUCE-3842) Add a toggle button to all web pages to stop automatic refreshs

2012-06-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291057#comment-13291057
 ] 

Hadoop QA commented on MAPREDUCE-3842:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12531265/MAPREDUCE-3842.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2444//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2444//console

This message is automatically generated.

> Add a toggle button to all web pages to stop automatic refreshs
> ---
>
> Key: MAPREDUCE-3842
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3842
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2, webapps
>Affects Versions: 0.23.1
>Reporter: Alejandro Abdelnur
>Assignee: Thomas Graves
>Priority: Critical
> Attachments: MAPREDUCE-3842.patch
>
>
> The automatic refresh makes it quite hard to look at something specific, as it 
> makes the page jump and sometimes resets its position. 
> This is especially painful when looking at jobs with a large number of tasks.





[jira] [Updated] (MAPREDUCE-3842) Add a toggle button to all web pages to stop automatic refreshs

2012-06-07 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated MAPREDUCE-3842:
-

Target Version/s: 0.23.3
  Status: Patch Available  (was: Open)

Patch manually tested.

> Add a toggle button to all web pages to stop automatic refreshs
> ---
>
> Key: MAPREDUCE-3842
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3842
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2, webapps
>Affects Versions: 0.23.1
>Reporter: Alejandro Abdelnur
>Assignee: Thomas Graves
>Priority: Critical
> Attachments: MAPREDUCE-3842.patch
>
>
> The automatic refresh makes it quite hard to look at something specific, as it 
> makes the page jump and sometimes resets its position. 
> This is especially painful when looking at jobs with a large number of tasks.





[jira] [Assigned] (MAPREDUCE-3842) Add a toggle button to all web pages to stop automatic refreshs

2012-06-07 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned MAPREDUCE-3842:


Assignee: Thomas Graves

> Add a toggle button to all web pages to stop automatic refreshs
> ---
>
> Key: MAPREDUCE-3842
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3842
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2, webapps
>Affects Versions: 0.23.1
>Reporter: Alejandro Abdelnur
>Assignee: Thomas Graves
>Priority: Critical
> Attachments: MAPREDUCE-3842.patch
>
>
> The automatic refresh makes it quite hard to look at something specific, as it 
> makes the page jump and sometimes resets its position. 
> This is especially painful when looking at jobs with a large number of tasks.





[jira] [Updated] (MAPREDUCE-3842) Add a toggle button to all web pages to stop automatic refreshs

2012-06-07 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated MAPREDUCE-3842:
-

Attachment: MAPREDUCE-3842.patch

Refresh becomes a big issue when trying to debug with a large number of
tasks/attempts.

This patch removes refresh on all the pages for now, which I think makes the
behavior consistent across all of them.

> Add a toggle button to all web pages to stop automatic refreshs
> ---
>
> Key: MAPREDUCE-3842
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3842
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2, webapps
>Affects Versions: 0.23.1
>Reporter: Alejandro Abdelnur
>Priority: Critical
> Attachments: MAPREDUCE-3842.patch
>
>
> The automatic refresh makes it quite hard to look at something specific, as it
> makes the page jump and sometimes resets its position.
> This is especially painful when looking at jobs with a large number of tasks.





[jira] [Commented] (MAPREDUCE-4321) DefaultTaskController fails to launch tasks on Windows

2012-06-07 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291027#comment-13291027
 ] 

Daryn Sharp commented on MAPREDUCE-4321:


+1 Looks good!  One suggestion to consider: would it help in general to
create a {{Path#toFile()}} method and a {{Path(File)}} ctor?  I had considered
the additional ctor change on HADOOP-8139 and I believe Doug liked the idea.

> DefaultTaskController fails to launch tasks on Windows
> --
>
> Key: MAPREDUCE-4321
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4321
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Ivan Mitic
>Assignee: Ivan Mitic
> Attachments: MAPREDUCE-4321-branch-1-win.patch
>
>
> DefaultTaskController#launchTask tries to run the child JVM task with the 
> following command line:
> {code}cmd.exe /c /c:/some/path.../taskjvm.cmd{code}
> And this fails because the given path is prefixed with a forward slash. This 
> also causes a number of tests to fail:
> org.apache.hadoop.conf.TestNoDefaultsJobConf
> org.apache.hadoop.fs.TestCopyFiles
> org.apache.hadoop.mapred.TestBadRecords
> org.apache.hadoop.mapred.TestClusterMRNotification
> org.apache.hadoop.mapred.TestCompressedEmptyMapOutputs
> org.apache.hadoop.mapred.TestControlledMapReduceJob
> org.apache.hadoop.mapred.TestCustomOutputCommitter
> org.apache.hadoop.mapred.TestEmptyJob
> org.apache.hadoop.mapred.TestFileOutputFormat
> org.apache.hadoop.mapred.TestIsolationRunner
> org.apache.hadoop.mapred.TestJavaSerialization
> org.apache.hadoop.mapred.TestJobCleanup
> org.apache.hadoop.mapred.TestJobCounters
> org.apache.hadoop.mapred.TestJobHistoryServer
> org.apache.hadoop.mapred.TestJobInProgressListener
> org.apache.hadoop.mapred.TestJobKillAndFail
> org.apache.hadoop.mapred.TestJobName
> ...
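
To illustrate the failure described above, here is a minimal sketch of
detecting and stripping a leading forward slash in front of a Windows drive
letter. The class {{WindowsPathSketch}}, the method {{stripLeadingSlash}} and
the example path are hypothetical and made up for illustration; this is not
the code from the attached patch, and it does not use Hadoop's {{Path}} class.

```java
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Hadoop; it only illustrates the path problem. */
final class WindowsPathSketch {
    // A leading slash directly followed by a drive letter and a colon, e.g. "/c:".
    private static final Pattern SLASH_DRIVE = Pattern.compile("^/[A-Za-z]:.*");

    /** Returns the path without the leading slash when it precedes a drive letter. */
    static String stripLeadingSlash(String path) {
        if (SLASH_DRIVE.matcher(path).matches()) {
            return path.substring(1);
        }
        // Paths without a drive letter (e.g. Unix paths) are returned unchanged.
        return path;
    }

    public static void main(String[] args) {
        // Made-up example path; prints "c:/tmp/taskjvm.cmd".
        System.out.println(stripLeadingSlash("/c:/tmp/taskjvm.cmd"));
    }
}
```

With a path cleaned up this way, the command line would read
"cmd.exe /c c:/..." instead of "cmd.exe /c /c:/...". Whether the fix belongs
in the task controller or in a {{Path}} to {{File}} conversion is a design
question for the patch, not something this sketch decides.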





[jira] [Commented] (MAPREDUCE-2454) Allow external sorter plugin for MR

2012-06-07 Thread Mariappan Asokan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291028#comment-13291028
 ] 

Mariappan Asokan commented on MAPREDUCE-2454:
-

The failing test seems to be a flaky one.  Googling on 
{{org.apache.hadoop.mapred.TestReduceFetchFromPartialMem}} shows a lot of hits 
in mapreduce jira.  I will look at the test more closely to see whether it can 
be fixed.  I welcome input from other developers on this.  Meanwhile, I can 
retry the same patch file to see whether this failure goes away magically.


> Allow external sorter plugin for MR
> ---
>
> Key: MAPREDUCE-2454
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2454
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Mariappan Asokan
>Priority: Minor
>  Labels: features, performance, plugin, sort
> Attachments: HadoopSortPlugin.pdf, KeyValueIterator.java, 
> MR-2454-trunkPatchPreview.gz, MapOutputSorter.java, 
> MapOutputSorterAbstract.java, ReduceInputSorter.java, mapreduce-2454.patch, 
> mr-2454-on-mr-279-build82.patch.gz
>
>
> Define interfaces and some abstract classes in the Hadoop framework to 
> facilitate external sorter plugins both on the Map and Reduce sides.





[jira] [Commented] (MAPREDUCE-4320) gridmix mainClass wrong in pom.xml

2012-06-07 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290991#comment-13290991
 ] 

Thomas Graves commented on MAPREDUCE-4320:
--

The findbugs warnings are a known issue; see MAPREDUCE-4239.

> gridmix mainClass wrong in pom.xml
> --
>
> Key: MAPREDUCE-4320
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4320
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/gridmix
>Affects Versions: 0.23.3
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Attachments: MAPREDUCE-4320.patch
>
>
> When trying to run gridmix, it is actually trying to run
> org.apache.hadoop.tools.HadoopArchives.
> The pom.xml needs to be fixed to have the correct mainClass:
> org.apache.hadoop.mapred.gridmix.Gridmix





[jira] [Commented] (MAPREDUCE-2980) Fetch failures and other related issues in Jetty 6.1.26

2012-06-07 Thread Kang Xiao (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290978#comment-13290978
 ] 

Kang Xiao commented on MAPREDUCE-2980:
--

Hi Todd, is Jetty 6.1.27 released now? If not, which version are you using at
present? We downgraded to Jetty 6.1.14, but it seems to cause a tasktracker
memory problem: org.mortbay.jetty.nio.SelectChannelConnector$ConnectorEndPoint
uses too much memory in the tasktracker.

> Fetch failures and other related issues in Jetty 6.1.26
> ---
>
> Key: MAPREDUCE-2980
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2980
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
>Affects Versions: 0.20.205.0, 0.23.0
>Reporter: Todd Lipcon
>Priority: Critical
>
> Since upgrading Jetty from 6.1.14 to 6.1.26 we've had a ton of HTTP-related 
> issues, including:
> - Much higher incidence of fetch failures
> - A few strange file-descriptor related bugs (eg MAPREDUCE-2389)
> - A few unexplained issues where long "fsck"s on the NameNode drop out 
> halfway through with a ClosedChannelException
> Stress tests with 1Map x 1Reduce sleep jobs reliably reproduce fetch 
> failures at a rate of about 1 per million on a 25 node test cluster. These 
> problems are all new since the upgrade from 6.1.14.





[jira] [Commented] (MAPREDUCE-4260) Investigate use of JobObject to spawn tasks on Windows

2012-06-07 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290915#comment-13290915
 ] 

Ivan Mitic commented on MAPREDUCE-4260:
---

This is a great change, thanks Bikas!

I have a few minor questions/suggestions:

1. ProcessTree.java: {{ProcessTree.isSetsidSupported}} Do we have to check for
the existence of the "Create a new task..." string before we can enable setsid?
It is not intuitive that one has to change another place in the code when
changing the winutils output. If you still need this check, can you please put
a comment in winutils to call this out?
2. ProcessTree.java: It might be useful to log when setsid functionality is not
available.
3. JobConf.java: Is there a scenario where one wouldn't want to use JobObjects
on Windows?
4. task.c:197 A check on whether {{LocalAlloc}} succeeded is missing.
5. task.c:201 I didn't get the motivation for the {{ERROR_MORE_DATA}} check.
Will {{procList->NumberOfAssignedProcesses}} be valid in case of this error?


> Investigate use of JobObject to spawn tasks on Windows
> --
>
> Key: MAPREDUCE-4260
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4260
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 1.0.0
>Reporter: Bikas Saha
>Assignee: Bikas Saha
> Attachments: MAPREDUCE-4260.branch-1-win.patch, MAPREDUCE-4260.patch, 
> test.cpp
>
>
> Currently, the Windows version spawns the task as a normal cmd shell from
> which other downstream exe's are spawned. However, this is not bulletproof
> because if an intermediate process exits before its child exits, then the
> parent-child process tree relationship cannot be constructed. Windows has a
> concept of JobObject that is similar to the setsid behavior used in Linux.
> The initial spawned task could be launched within its JobObject. Thereafter,
> process termination, memory management, etc. could be operated on the JobObject.





[jira] [Commented] (MAPREDUCE-4290) JobStatus.getState() API is giving ambiguous values

2012-06-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290898#comment-13290898
 ] 

Hadoop QA commented on MAPREDUCE-4290:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12531241/MAPREDUCE-4290.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 2 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2443//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2443//console

This message is automatically generated.

> JobStatus.getState() API is giving ambiguous values
> ---
>
> Key: MAPREDUCE-4290
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4290
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Nishan Shetty
>Assignee: Devaraj K
> Attachments: MAPREDUCE-4290.patch
>
>
> For a failed job, the getState() API returns SUCCEEDED when
> JobClient.getAllJobs() is used to retrieve all job info from the RM.





[jira] [Updated] (MAPREDUCE-4290) JobStatus.getState() API is giving ambiguous values

2012-06-07 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-4290:
-

Affects Version/s: 3.0.0
   Status: Patch Available  (was: Open)

The JobClient.getAllJobs() API reports the status as SUCCEEDED even if the job
has failed. The conversion from application report to job status considers
only the YARN application state. So if the application state is FINISHED and
the final status is FAILED, the job status still comes out as SUCCEEDED.


I have attached a patch to address this: the job status is reported as
SUCCEEDED only if the YARN application state is FINISHED and the final status
is SUCCEEDED.
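
The rule described in this comment can be sketched roughly as follows. This
is an illustration only, not the attached patch: the class {{JobStateSketch}}
and its three enums are hypothetical stand-ins for the real YARN and MapReduce
types, and the handling of every case other than "FINISHED with final status
SUCCEEDED" is an assumption made here.

```java
/** Hypothetical sketch; the enums stand in for the real YARN/MapReduce types. */
final class JobStateSketch {
    enum AppState { NEW, SUBMITTED, RUNNING, FINISHED, FAILED, KILLED }
    enum FinalStatus { UNDEFINED, SUCCEEDED, FAILED, KILLED }
    enum JobState { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

    /** Maps an application state plus its final status to a job state. */
    static JobState toJobState(AppState state, FinalStatus finalStatus) {
        switch (state) {
            case NEW:
            case SUBMITTED:
                return JobState.PREP;
            case RUNNING:
                return JobState.RUNNING;
            case FINISHED:
                // The point of the fix: FINISHED alone does not mean success.
                if (finalStatus == FinalStatus.SUCCEEDED) {
                    return JobState.SUCCEEDED;
                }
                // Assumption: a killed final status maps to KILLED, the rest to FAILED.
                if (finalStatus == FinalStatus.KILLED) {
                    return JobState.KILLED;
                }
                return JobState.FAILED;
            case KILLED:
                return JobState.KILLED;
            default:
                return JobState.FAILED;
        }
    }

    public static void main(String[] args) {
        // Prints "FAILED": the application finished, but its final status is FAILED.
        System.out.println(toJobState(AppState.FINISHED, FinalStatus.FAILED));
    }
}
```

The bug reported in this issue corresponds to a mapping that returns
SUCCEEDED for every FINISHED application without looking at the second
argument.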

> JobStatus.getState() API is giving ambiguous values
> ---
>
> Key: MAPREDUCE-4290
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4290
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Nishan Shetty
>Assignee: Devaraj K
> Attachments: MAPREDUCE-4290.patch
>
>
> For a failed job, the getState() API returns SUCCEEDED when
> JobClient.getAllJobs() is used to retrieve all job info from the RM.





[jira] [Updated] (MAPREDUCE-4290) JobStatus.getState() API is giving ambiguous values

2012-06-07 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-4290:
-

Attachment: MAPREDUCE-4290.patch

> JobStatus.getState() API is giving ambiguous values
> ---
>
> Key: MAPREDUCE-4290
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4290
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.0.0-alpha
>Reporter: Nishan Shetty
>Assignee: Devaraj K
> Attachments: MAPREDUCE-4290.patch
>
>
> For a failed job, the getState() API returns SUCCEEDED when
> JobClient.getAllJobs() is used to retrieve all job info from the RM.
