[jira] [Commented] (MAPREDUCE-4288) ClusterStatus.getMapTasks() and ClusterStatus.getReduceTasks() is giving one when no job is running
[ https://issues.apache.org/jira/browse/MAPREDUCE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291555#comment-13291555 ] Devaraj K commented on MAPREDUCE-4288: -- This is caused by the hard-coded values in the code below:
{code:title=ResourceMgrDelegate.java|borderStyle=solid}
public ClusterMetrics getClusterMetrics() throws IOException, InterruptedException {
  GetClusterMetricsRequest request =
      recordFactory.newRecordInstance(GetClusterMetricsRequest.class);
  GetClusterMetricsResponse response = applicationsManager.getClusterMetrics(request);
  YarnClusterMetrics metrics = response.getClusterMetrics();
  ClusterMetrics oldMetrics = new ClusterMetrics(1, 1, 1, 1, 1, 1,
      metrics.getNumNodeManagers() * 10, metrics.getNumNodeManagers() * 2,
      1, metrics.getNumNodeManagers(), 0, 0);
  return oldMetrics;
}
{code}
Here we cannot get runningMaps, runningReduces, occupiedMapSlots, etc. from the RM, because a YARN cluster is based entirely on resources and resource usage. It does not look good to always show these hard-coded values to users when they query cluster status via the JobClient.getClusterStatus() API. Any thoughts on this?
> ClusterStatus.getMapTasks() and ClusterStatus.getReduceTasks() is giving one
> when no job is running
> ---
>
> Key: MAPREDUCE-4288
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4288
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Affects Versions: 2.0.0-alpha
> Reporter: Nishan Shetty
>
> When no job is running in the cluster, invoke the ClusterStatus.getMapTasks()
> and ClusterStatus.getReduceTasks() APIs.
> Observe that these APIs return one instead of zero (as no job is
> running).
--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
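For context, one possible direction (a minimal sketch, not the patch for this issue) is to report 0 for the MR-specific values the RM cannot know, instead of a hard-coded 1, while keeping the node-derived slot estimates. The class and method names below are hypothetical stand-ins, not Hadoop's actual API:

```java
// Sketch: derive MR-era cluster metrics from YARN's node count, reporting 0
// (not a hard-coded 1) for values the RM cannot supply. Illustrative only.
public class ClusterMetricsSketch {
    // Heuristic multipliers taken from the snippet above.
    static final int MAP_SLOTS_PER_NODE = 10;
    static final int REDUCE_SLOTS_PER_NODE = 2;

    /** Returns {runningMaps, runningReduces, mapSlots, reduceSlots, activeTrackers}. */
    static int[] fromNodeCount(int numNodeManagers) {
        return new int[] {
            0,                                       // runningMaps: unknown to the RM, so 0
            0,                                       // runningReduces: unknown, so 0
            numNodeManagers * MAP_SLOTS_PER_NODE,    // estimated map slots
            numNodeManagers * REDUCE_SLOTS_PER_NODE, // estimated reduce slots
            numNodeManagers                          // one "tracker" per NodeManager
        };
    }

    public static void main(String[] args) {
        int[] m = fromNodeCount(4);
        System.out.println(java.util.Arrays.toString(m)); // [0, 0, 40, 8, 4]
    }
}
```

With this shape, an idle cluster at least reports zero running tasks rather than one.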
[jira] [Commented] (MAPREDUCE-4327) Enhance CS to schedule accounting for both memory and cpu cores
[ https://issues.apache.org/jira/browse/MAPREDUCE-4327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291553#comment-13291553 ] Arun C Murthy commented on MAPREDUCE-4327: -- An option to consider for multi-resource scheduling is the approach outlined by Ghodsi et al in the DRF paper: http://www.eecs.berkeley.edu/Pubs/TechRpts/2011/EECS-2011-18.pdf > Enhance CS to schedule accounting for both memory and cpu cores > --- > > Key: MAPREDUCE-4327 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4327 > Project: Hadoop Map/Reduce > Issue Type: New Feature > Components: mrv2, resourcemanager, scheduler >Affects Versions: 2.0.0-alpha >Reporter: Arun C Murthy >Assignee: Arun C Murthy > > With YARN being a general purpose system, it would be useful for several > applications (MPI et al) to specify not just memory but also CPU (cores) for > their resource requirements. Thus, it would be useful to the > CapacityScheduler to account for both.
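The DRF approach referenced above can be summarized briefly: each user's dominant share is the maximum, across resource types, of the fraction of that resource allocated to them, and the scheduler serves the user with the smallest dominant share next. A minimal sketch of that selection rule (illustrative, not CapacityScheduler code):

```java
// Minimal Dominant Resource Fairness sketch: pick the user whose dominant
// share (max over resources of allocated/capacity) is currently smallest.
public class DrfSketch {
    /** Dominant share: the largest fraction this user holds of any resource. */
    static double dominantShare(double[] used, double[] capacity) {
        double share = 0.0;
        for (int r = 0; r < used.length; r++) {
            share = Math.max(share, used[r] / capacity[r]);
        }
        return share;
    }

    /** Index of the user DRF would serve next (ties go to the lowest index). */
    static int nextUser(double[][] used, double[] capacity) {
        int best = 0;
        for (int u = 1; u < used.length; u++) {
            if (dominantShare(used[u], capacity) < dominantShare(used[best], capacity)) {
                best = u;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] cap = {100.0, 40.0};                  // e.g. 100 GB memory, 40 cores
        double[][] used = {{30.0, 4.0}, {10.0, 8.0}};  // user 0: 0.30 dominant; user 1: 0.20
        System.out.println(nextUser(used, cap));       // 1 (smaller dominant share)
    }
}
```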
[jira] [Created] (MAPREDUCE-4327) Enhance CS to schedule accounting for both memory and cpu cores
Arun C Murthy created MAPREDUCE-4327: Summary: Enhance CS to schedule accounting for both memory and cpu cores Key: MAPREDUCE-4327 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4327 Project: Hadoop Map/Reduce Issue Type: New Feature Components: mrv2, resourcemanager, scheduler Affects Versions: 2.0.0-alpha Reporter: Arun C Murthy Assignee: Arun C Murthy With YARN being a general purpose system, it would be useful for several applications (MPI et al) to specify not just memory but also CPU (cores) for their resource requirements. Thus, it would be useful to the CapacityScheduler to account for both.
[jira] [Commented] (MAPREDUCE-4318) TestRecoveryManager should not use raw and deprecated configuration parameters.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291550#comment-13291550 ] Hudson commented on MAPREDUCE-4318: --- Integrated in Hadoop-Mapreduce-22-branch #105 (See [https://builds.apache.org/job/Hadoop-Mapreduce-22-branch/105/]) MAPREDUCE-4318. TestRecoveryManager should not use raw configuration keys. Contributed by Benoy Antony. (Revision 1347853) Result = FAILURE shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1347853 Files : * /hadoop/common/branches/branch-0.22/mapreduce/CHANGES.txt * /hadoop/common/branches/branch-0.22/mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestRecoveryManager.java > TestRecoveryManager should not use raw and deprecated configuration > parameters. > --- > > Key: MAPREDUCE-4318 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4318 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Affects Versions: 0.22.1 >Reporter: Konstantin Shvachko >Assignee: Benoy Antony > Fix For: 0.22.1 > > Attachments: MAPREDUCE-4318.patch > > > TestRecoveryManager should not use deprecated config keys, and should use > constants for the keys where possible.
[jira] [Commented] (MAPREDUCE-4289) JobStatus.getReduceProgress() and JobStatus.getMapProgress() API's not giving any values
[ https://issues.apache.org/jira/browse/MAPREDUCE-4289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291548#comment-13291548 ] Devaraj K commented on MAPREDUCE-4289: -- This is caused by the hard-coded values in the code below:
{code:title=TypeConverter.java|borderStyle=solid}
public static JobStatus fromYarn(ApplicationReport application, String jobFile) {
  String trackingUrl = application.getTrackingUrl();
  trackingUrl = trackingUrl == null ? "" : trackingUrl;
  JobStatus jobStatus = new JobStatus(
      TypeConverter.fromYarn(application.getApplicationId()),
      0.0f, 0.0f, 0.0f, 0.0f,
      TypeConverter.fromYarn(application.getYarnApplicationState(),
          application.getFinalApplicationStatus()),
      org.apache.hadoop.mapreduce.JobPriority.NORMAL,
      application.getUser(), application.getName(),
      application.getQueue(), jobFile, trackingUrl, false);
  jobStatus.setSchedulingInfo(trackingUrl); // Set AM tracking url
{code}
Here we have no provision to get the map and reduce progress from the RM. It does not look good to always show these hard-coded values to users when they use the JobClient.getAllJobs() API. Any thoughts?
> JobStatus.getReduceProgress() and JobStatus.getMapProgress() API's not giving
> any values
>
>
> Key: MAPREDUCE-4289
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4289
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Affects Versions: 2.0.0-alpha
> Reporter: Nishan Shetty
>
> 1. Run a simple job
> 2. Invoke the JobStatus.getReduceProgress() and JobStatus.getMapProgress() APIs
> Observe that these APIs return zeros instead of showing map/reduce
> progress.
[jira] [Assigned] (MAPREDUCE-3902) MR AM should reuse containers for map tasks, there-by allowing fine-grained control on num-maps for users without need for CombineFileInputFormat etc.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy reassigned MAPREDUCE-3902: Assignee: Siddharth Seth (was: Arun C Murthy) > MR AM should reuse containers for map tasks, there-by allowing fine-grained > control on num-maps for users without need for CombineFileInputFormat etc. > -- > > Key: MAPREDUCE-3902 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3902 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: applicationmaster, mrv2 >Reporter: Arun C Murthy >Assignee: Siddharth Seth > Attachments: MAPREDUCE-3902.patch > > > The MR AM is now in a great position to reuse containers across (map) tasks. > This is something similar to JVM re-use we had in 0.20.x, but in a > significantly better manner: > # Consider data-locality when re-using containers > # Consider the new shuffle - ensure that reduces fetch output of the whole > container at once (i.e. all maps)
[jira] [Created] (MAPREDUCE-4326) Resurrect RM Restart
Arun C Murthy created MAPREDUCE-4326: Summary: Resurrect RM Restart Key: MAPREDUCE-4326 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4326 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, resourcemanager Affects Versions: 2.0.0-alpha Reporter: Arun C Murthy Assignee: Bikas Saha We should resurrect 'RM Restart' which we disabled sometime during the RM refactor.
[jira] [Commented] (MAPREDUCE-3842) Add a toggle button to all web pages to stop automatic refreshs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291537#comment-13291537 ] Alejandro Abdelnur commented on MAPREDUCE-3842: --- +1. Built and started a cluster: no auto-refreshing, and everything else behaves the same. > Add a toggle button to all web pages to stop automatic refreshs > --- > > Key: MAPREDUCE-3842 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3842 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mrv2, webapps >Affects Versions: 0.23.1 >Reporter: Alejandro Abdelnur >Assignee: Thomas Graves >Priority: Critical > Attachments: MAPREDUCE-3842.patch > > > The automatic refresh makes it quite hard to look at something specific, as it > makes the page jump and sometimes resets its position. > This is especially painful when looking at jobs with a large number of tasks.
[jira] [Commented] (MAPREDUCE-4298) NodeManager crashed after running out of file descriptors
[ https://issues.apache.org/jira/browse/MAPREDUCE-4298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291492#comment-13291492 ] Jason Lowe commented on MAPREDUCE-4298: --- This occurred again on one of our clusters. Turns out I was mistaken earlier, the file descriptor ulimit for our nodemanager daemons is set to 32768, not 8192. Fortunately this time we were able to examine some nodemanagers that had leaked numerous file descriptors but had not fallen over yet. Almost all of the file descriptors were referencing map outputs for the shuffle, often hundreds of file descriptors open to the same file. Interestingly almost all of the map files corresponded to just one job. Examining the NM log around the time that job ran, I found numerous exceptions in it showing things had not gone smoothly during the shuffle for that job. For example:
{noformat}
[New I/O server worker #1-5]java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:100)
    at sun.nio.ch.IOUtil.write(IOUtil.java:56)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
    at org.jboss.netty.channel.socket.nio.SocketSendBufferPool$PooledSendBuffer.transferTo(SocketSendBufferPool.java:239)
    at org.jboss.netty.channel.socket.nio.NioWorker.write0(NioWorker.java:470)
    at org.jboss.netty.channel.socket.nio.NioWorker.writeFromUserCode(NioWorker.java:388)
    at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.handleAcceptedSocket(NioServerSocketPipelineSink.java:137)
    at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.eventSunk(NioServerSocketPipelineSink.java:76)
    at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:68)
    at org.jboss.netty.handler.stream.ChunkedWriteHandler.flush(ChunkedWriteHandler.java:253)
    at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleDownstream(ChunkedWriteHandler.java:123)
    at org.jboss.netty.channel.Channels.write(Channels.java:611)
    at org.jboss.netty.channel.Channels.write(Channels.java:578)
    at org.jboss.netty.channel.AbstractChannel.write(AbstractChannel.java:259)
    at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.sendMapOutput(ShuffleHandler.java:477)
    at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.messageReceived(ShuffleHandler.java:397)
    at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:144)
    at org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:116)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:302)
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.unfoldAndfireMessageReceived(ReplayingDecoder.java:523)
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:507)
    at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:444)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:274)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:261)
    at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:350)
    at org.jboss.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:281)
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:201)
    at org.jboss.netty.util.internal.IoWorkerRunnable.run(IoWorkerRunnable.java:46)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
{noformat}
Looking closer at the job, I could see that it had run with 15000 maps and 2000 reduces.
Hundreds of the reducers had failed running out of heap space during the shuffle phase, which led to broken pipe and connection reset errors on the nodemanagers trying to serve up shuffle data to those reducers when they died. I was able to reproduce the broken pipe issue and step through the code with a debugger. Normally the file descriptor is closed by adding a ChannelFuture after the map data is written, and that future's operationComplete() callback closes the file. However, when there is an I/O error sending the shuffle header, Netty closes down the channel automatically (plus we explicitly close it in a channel exception handler). By the time we try to write the map file data to the channel, the channel is already closed. And I was able to see that if we write to a closed channel, the ChannelFuture's operationComplete() callback never fires, so the map output file is never closed.
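The fix direction this analysis points at, closing the map output file no matter how the asynchronous write finishes, can be sketched in plain Java. Note this is an illustrative sketch of the pattern only: `WriteFuture` below is a hypothetical stand-in for Netty's ChannelFuture, not a real API.

```java
import java.io.Closeable;
import java.io.IOException;

// Sketch of the close-on-any-outcome pattern: register a completion callback
// that closes the resource whether the async write succeeded or failed.
public class CloseOnComplete {
    // Hypothetical stand-in for an async write future; the listener receives
    // true on success, false on failure.
    interface WriteFuture { void addListener(java.util.function.Consumer<Boolean> onDone); }

    static void sendAndAlwaysClose(WriteFuture future, Closeable mapOutput) {
        future.addListener(success -> {
            try {
                mapOutput.close();   // close on success AND on failure
            } catch (IOException ignored) { }
        });
    }

    public static void main(String[] args) {
        final boolean[] closed = {false};
        Closeable file = () -> closed[0] = true;
        // Simulate a write that fails (e.g. the peer reset the connection):
        sendAndAlwaysClose(listener -> listener.accept(false), file);
        System.out.println("closed=" + closed[0]);   // descriptor released despite the failure
    }
}
```

The design point: the close must be tied to the future's completion (any outcome), not to the success path alone, or an already-closed channel leaks the descriptor.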
[jira] [Resolved] (MAPREDUCE-4318) TestRecoveryManager should not use raw and deprecated configuration parameters.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko resolved MAPREDUCE-4318. Resolution: Fixed Fix Version/s: 0.22.1 Hadoop Flags: Reviewed I just committed this. Thank you Benoy. > TestRecoveryManager should not use raw and deprecated configuration > parameters. > --- > > Key: MAPREDUCE-4318 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4318 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Affects Versions: 0.22.1 >Reporter: Konstantin Shvachko >Assignee: Benoy Antony > Attachments: MAPREDUCE-4318.patch > > > TestRecoveryManager should not use deprecated config keys, and should use > constants for the keys where possible.
[jira] [Updated] (MAPREDUCE-4317) Job view ACL checks are too permissive
[ https://issues.apache.org/jira/browse/MAPREDUCE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated MAPREDUCE-4317: Attachment: MR-4317.patch.v1 Added v1 patch for this - the changes are fairly small and straight-forward. 1. I didn't see any tests checking TaskGraphServlet. 2. Do we need to add a test to verify this behavior? If so, can someone please point me to similar existing tests. > Job view ACL checks are too permissive > -- > > Key: MAPREDUCE-4317 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4317 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv1 >Affects Versions: 1.0.3 >Reporter: Harsh J >Assignee: Karthik Kambatla > Attachments: MR-4317.patch.v1 > > > The class that does view-based checks, JSPUtil.JobWithViewAccessCheck, has > the following internal member: > {code}private boolean isViewAllowed = true;{code} > Note that its true. > Now, in the method that sets proper view-allowed rights, has: > {code} > if (user != null && job != null && jt.areACLsEnabled()) { > final UserGroupInformation ugi = > UserGroupInformation.createRemoteUser(user); > try { > ugi.doAs(new PrivilegedExceptionAction() { > public Void run() throws IOException, ServletException { > // checks job view permission > jt.getACLsManager().checkAccess(job, ugi, > Operation.VIEW_JOB_DETAILS); > return null; > } > }); > } catch (AccessControlException e) { > String errMsg = "User " + ugi.getShortUserName() + > " failed to view " + jobid + "!" 
+ e.getMessage() + > "Go back to JobTracker"; > JSPUtil.setErrorAndForward(errMsg, request, response); > myJob.setViewAccess(false); > } catch (InterruptedException e) { > String errMsg = " Interrupted while trying to access " + jobid + > "Go back to JobTracker"; > JSPUtil.setErrorAndForward(errMsg, request, response); > myJob.setViewAccess(false); > } > } > return myJob; > {code} > In the above snippet, you can see that user==null, which can happen if the > user is not http-authenticated (the user is obtained via request.getRemoteUser()), can > lead to the view being visible, since the default is true and we never toggle > the view to false for the user == null case. > Ideally the default of the view job ACL should be false, or we need an else > clause that sets the view rights to false when the user ID cannot be > determined.
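The default-deny fix suggested above can be sketched as follows (illustrative names, not the actual JSPUtil code): start with view access denied and grant it only after an explicit ACL check succeeds, so the user == null case stays denied.

```java
// Sketch of the default-deny fix: fail closed, then grant access only on an
// explicit successful check. Class and method names here are illustrative.
public class ViewAccessCheck {
    private boolean viewAllowed = false;   // fail closed: deny until proven allowed

    void check(String user, boolean aclsEnabled, java.util.Set<String> allowedUsers) {
        if (!aclsEnabled) {
            viewAllowed = true;            // ACLs disabled: everyone may view
            return;
        }
        // An unauthenticated user (user == null) now stays denied instead of
        // inheriting a default of true.
        viewAllowed = user != null && allowedUsers.contains(user);
    }

    boolean isViewAllowed() { return viewAllowed; }

    public static void main(String[] args) {
        ViewAccessCheck c = new ViewAccessCheck();
        c.check(null, true, java.util.Set.of("alice"));
        System.out.println(c.isViewAllowed());   // false: unauthenticated user denied
    }
}
```

Flipping the default means a forgotten code path denies access rather than silently granting it.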
[jira] [Updated] (MAPREDUCE-4318) TestRecoveryManager should not use raw and deprecated configuration parameters.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-4318: --- Summary: TestRecoveryManager should not use raw and deprecated configuration parameters. (was: TestRecoveryManagershould not use raw and deprecated configuration parameters.) +1 Looks good to me. > TestRecoveryManager should not use raw and deprecated configuration > parameters. > --- > > Key: MAPREDUCE-4318 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4318 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Affects Versions: 0.22.1 >Reporter: Konstantin Shvachko >Assignee: Benoy Antony > Attachments: MAPREDUCE-4318.patch > > > TestRecoveryManager should not use deprecated config keys, and should use > constants for the keys where possible.
[jira] [Commented] (MAPREDUCE-4321) DefaultTaskController fails to launch tasks on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-4321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291451#comment-13291451 ] Ivan Mitic commented on MAPREDUCE-4321: --- Thanks Daryn, I opened HADOOP-8493 to track this. > DefaultTaskController fails to launch tasks on Windows > -- > > Key: MAPREDUCE-4321 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4321 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 1.0.0 >Reporter: Ivan Mitic >Assignee: Ivan Mitic > Attachments: MAPREDUCE-4321-branch-1-win.patch > > > DefaultTaskController#launchTask tries to run the child JVM task with the > following command line: > {code}cmd.exe /c /c:/some/path.../taskjvm.cmd{code} > And this fails because the given path is prefixed with a forward slash. This > also causes a number of tests to fail: > org.apache.hadoop.conf.TestNoDefaultsJobConf > org.apache.hadoop.fs.TestCopyFiles > org.apache.hadoop.mapred.TestBadRecords > org.apache.hadoop.mapred.TestClusterMRNotification > org.apache.hadoop.mapred.TestCompressedEmptyMapOutputs > org.apache.hadoop.mapred.TestControlledMapReduceJob > org.apache.hadoop.mapred.TestCustomOutputCommitter > org.apache.hadoop.mapred.TestEmptyJob > org.apache.hadoop.mapred.TestFileOutputFormat > org.apache.hadoop.mapred.TestIsolationRunner > org.apache.hadoop.mapred.TestJavaSerialization > org.apache.hadoop.mapred.TestJobCleanup > org.apache.hadoop.mapred.TestJobCounters > org.apache.hadoop.mapred.TestJobHistoryServer > org.apache.hadoop.mapred.TestJobInProgressListener > org.apache.hadoop.mapred.TestJobKillAndFail > org.apache.hadoop.mapred.TestJobName > ...
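For illustration, the spurious-leading-slash problem described in that report could be normalized as sketched below. This is an assumption-laden sketch, not the committed patch, and the example path is hypothetical, not the elided one from the report:

```java
// Sketch: normalize a Unix-style absolute path such as "/c:/dir/taskjvm.cmd"
// into a form cmd.exe accepts ("c:\dir\taskjvm.cmd"). Illustrative only.
public class WinPathFix {
    static String toWindowsPath(String p) {
        // Drop the spurious leading slash in front of a drive letter.
        if (p.matches("^/[a-zA-Z]:/.*")) {
            p = p.substring(1);
        }
        // cmd.exe is happier with backslash separators.
        return p.replace('/', '\\');
    }

    public static void main(String[] args) {
        // Hypothetical example path:
        System.out.println(toWindowsPath("/c:/hadoop/tmp/taskjvm.cmd"));
        // -> c:\hadoop\tmp\taskjvm.cmd
    }
}
```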
[jira] [Commented] (MAPREDUCE-4305) Implement delay scheduling in capacity scheduler for improving data locality
[ https://issues.apache.org/jira/browse/MAPREDUCE-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291444#comment-13291444 ] Konstantin Shvachko commented on MAPREDUCE-4305: Task locality is important. It is interesting that it was only necessary to hook the Capacity Scheduler up to the logic that already existed in JobInProgress etc. I went over the general logic of the patch. It looks good, but I have several formatting and code organization comments.
# Append _PROPERTY to new config key constants, e.g. NODE_LOCALITY_DELAY_PROPERTY. The other constants in CapacitySchedulerConf follow that pattern.
# Break long lines.
# In CapacitySchedulerConf, convert the comments describing variables into JavaDoc.
# In initializeDefaults() you should use {{capacity-scheduler}} not {{fairscheduler}} config variables. Also, since you introduced constants for the keys, use them rather than the raw keys.
# JobInfo is confusing because there is already a class with that name. Call it something like JobLocality. I'd rather move it into JobQueuesManager, because the latter maintains the map of those.
# Correct the indentation in CapacityTaskScheduler; in particular, eliminate all tabs, which should be spaces only.
# Add spaces between arguments, around operators, and in some LOG messages.
# Add empty lines between new methods.
# updateLocalityWaitTimes() and updateLastMapLocalityLevel() should belong to JobQueuesManager, imo.
# JobQueuesManager.infos is a map keyed by JobInProgress. Wouldn't it be better to use JobID as the key?
# In TaskSchedulingMgr you need only one version of obtainNewTask to be abstract, the one with the cachelevel parameter. The other one should not be abstract and should just call the abstract obtainNewTask() with cachelevel set to any.
> Implement delay scheduling in capacity scheduler for improving data locality
>
>
> Key: MAPREDUCE-4305
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4305
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Reporter: Mayank Bansal
> Assignee: Mayank Bansal
> Attachments: MAPREDUCE-4305, MAPREDUCE-4305-1.patch
>
> With the Capacity Scheduler, data-local tasks are only about 40-50%, which is not good.
> In my tests with a 70-node cluster I consistently get data locality around
> 40-50% on a free cluster.
> I think we need to implement something like delay scheduling in the capacity
> scheduler to improve data locality.
> http://radlab.cs.berkeley.edu/publication/308
> After implementing delay scheduling on Hadoop 22, I am getting 100% data
> locality on a free cluster and around 90% data locality on a busy cluster.
> Thanks,
> Mayank
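The delay-scheduling idea from the paper linked above reduces to one decision per scheduling opportunity: a job declines non-local slots until it has waited longer than a configured delay, then relaxes its locality requirement. A minimal sketch of that rule (names are illustrative, not the patch's actual ones):

```java
// Minimal delay-scheduling sketch (after the Berkeley delay-scheduling paper):
// a job declines non-local slots until it has waited past a configured delay.
public class DelaySchedulingSketch {
    // Assumed config value; the real constant name/value would come from the patch.
    static final long NODE_LOCALITY_DELAY_MS = 5_000;

    /** true if the scheduler should assign this (possibly non-local) slot now. */
    static boolean shouldAssign(boolean taskIsLocal, long msJobHasWaited) {
        if (taskIsLocal) {
            return true;   // local work: always take it
        }
        // Non-local: only give up on locality after waiting long enough.
        return msJobHasWaited >= NODE_LOCALITY_DELAY_MS;
    }

    public static void main(String[] args) {
        System.out.println(shouldAssign(false, 1_000));  // false: keep waiting for locality
        System.out.println(shouldAssign(false, 6_000));  // true: relax and run remotely
    }
}
```

The trade-off is a small scheduling latency in exchange for the large locality gains reported in the description.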
[jira] [Commented] (MAPREDUCE-4311) Capacity scheduler.xml does not accept decimal values for capacity and maximum-capacity settings
[ https://issues.apache.org/jira/browse/MAPREDUCE-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291382#comment-13291382 ] Hadoop QA commented on MAPREDUCE-4311: --
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12531301/MAPREDUCE-4311.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 5 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2446//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2446//console
This message is automatically generated.
> Capacity scheduler.xml does not accept decimal values for capacity and
> maximum-capacity settings
>
>
> Key: MAPREDUCE-4311
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4311
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: contrib/capacity-sched, mrv2
> Affects Versions: 0.23.3
> Reporter: Thomas Graves
> Assignee: Karthik Kambatla
> Attachments: MAPREDUCE-4311.patch
>
> If the capacity scheduler's capacity or maximum-capacity is set to a decimal value, it errors:
> - Error starting ResourceManager
> java.lang.NumberFormatException: For input string: "10.5"
> at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
> at java.lang.Integer.parseInt(Integer.java:458)
> at java.lang.Integer.parseInt(Integer.java:499)
> at org.apache.hadoop.conf.Configuration.getInt(Configuration.java:713)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getCapacity(CapacitySchedulerConfiguration.java:147)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.(LeafQueue.java:147)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:297)
> at
> 0.20 used to accept decimals, and this could be an issue on large clusters that
> would have queues with small allocations.
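The stack trace above points at the fix: parse the capacity as a floating-point value (Configuration.getFloat-style) rather than through Integer.parseInt, so "10.5" is accepted. A minimal standalone sketch of that parsing behavior (not the actual CapacitySchedulerConfiguration code):

```java
// Sketch of the decimal-capacity fix: accept both "10" and "10.5" by parsing
// the raw config value as a float, falling back to a default on bad input.
public class CapacityParseSketch {
    static float getCapacity(String raw, float defaultValue) {
        if (raw == null) {
            return defaultValue;            // key absent: use the default
        }
        try {
            return Float.parseFloat(raw.trim());   // "10" and "10.5" both parse
        } catch (NumberFormatException e) {
            return defaultValue;            // malformed value: fall back, don't crash
        }
    }

    public static void main(String[] args) {
        System.out.println(getCapacity("10.5", 100f));  // 10.5, no NumberFormatException
    }
}
```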
[jira] [Updated] (MAPREDUCE-4311) Capacity scheduler.xml does not accept decimal values for capacity and maximum-capacity settings
[ https://issues.apache.org/jira/browse/MAPREDUCE-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated MAPREDUCE-4311: Status: Patch Available (was: Open) > Capacity scheduler.xml does not accept decimal values for capacity and > maximum-capacity settings > > > Key: MAPREDUCE-4311 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4311 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/capacity-sched, mrv2 >Affects Versions: 0.23.3 >Reporter: Thomas Graves >Assignee: Karthik Kambatla > Attachments: MAPREDUCE-4311.patch > > > if capacity scheduler capacity or max capacity set with decimal it errors: > - Error starting ResourceManager > java.lang.NumberFormatException: For input string: "10.5" > at > java.lang.NumberFormatException.forInputString(NumberFormatException.java:48) > at java.lang.Integer.parseInt(Integer.java:458) > at java.lang.Integer.parseInt(Integer.java:499) > at org.apache.hadoop.conf.Configuration.getInt(Configuration.java:713) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getCapacity(CapacitySchedulerConfiguration.java:147) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.(LeafQueue.java:147) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:297) > at > 0.20 used to take decimal and this could be an issue on large clusters that > would have queues with small allocations.
[jira] [Commented] (MAPREDUCE-4306) Problem running Distributed Shell applications as a user other than the one started the daemons
[ https://issues.apache.org/jira/browse/MAPREDUCE-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291350#comment-13291350 ] Hadoop QA commented on MAPREDUCE-4306: --
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12531308/MAPREDUCE-4306.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell.
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2445//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2445//console
This message is automatically generated.
> Problem running Distributed Shell applications as a user other than the one
> started the daemons
> ---
>
> Key: MAPREDUCE-4306
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4306
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Affects Versions: 2.0.0-alpha
> Reporter: Ahmed Radwan
> Assignee: Ahmed Radwan
> Fix For: 2.0.1-alpha
>
> Attachments: MAPREDUCE-4306.patch
>
> Using the tarball, if you start the yarn daemons as one user and then
> switch to a different user, you can successfully run MR jobs, but DS jobs
> fail to run. DS jobs can only be run by the user who started the daemons.
[jira] [Commented] (MAPREDUCE-2771) The fs docs should cover mapred.fairscheduler.assignmultiple
[ https://issues.apache.org/jira/browse/MAPREDUCE-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291341#comment-13291341 ] Tomohiko Kinebuchi commented on MAPREDUCE-2771: --- Is the target page now here? -> http://hadoop.apache.org/common/docs/stable/fair_scheduler.html > The fs docs should cover mapred.fairscheduler.assignmultiple > > > Key: MAPREDUCE-2771 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-2771 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: contrib/fair-share, documentation >Reporter: Eli Collins > Labels: newbie > Fix For: 0.24.0 > > > The fs docs should cover the {{mapred.fairscheduler.assignmultiple*}} config > options. > http://hadoop.apache.org/common/docs/current/fair_scheduler.html -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4306) Problem running Distributed Shell applications as a user other than the one started the daemons
[ https://issues.apache.org/jira/browse/MAPREDUCE-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmed Radwan updated MAPREDUCE-4306: Fix Version/s: 2.0.1-alpha Status: Patch Available (was: Open) > Problem running Distributed Shell applications as a user other than the one > started the daemons > --- > > Key: MAPREDUCE-4306 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4306 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 2.0.0-alpha >Reporter: Ahmed Radwan >Assignee: Ahmed Radwan > Fix For: 2.0.1-alpha > > Attachments: MAPREDUCE-4306.patch > > > Using the tarball, if you start the yarn daemons using one user and then > switch to a different user. You can successfully run MR jobs, but DS jobs > fail to run. Only able to run DS jobs using the user who started the daemons. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4306) Problem running Distributed Shell applications as a user other than the one started the daemons
[ https://issues.apache.org/jira/browse/MAPREDUCE-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmed Radwan updated MAPREDUCE-4306: Attachment: MAPREDUCE-4306.patch Here is the patch. I have manually tested it on a single-node cluster, where I started the daemons as one user and then confirmed that both a different user and the user who started the daemons can successfully run distributed shell jobs. > Problem running Distributed Shell applications as a user other than the one > started the daemons > --- > > Key: MAPREDUCE-4306 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4306 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 2.0.0-alpha >Reporter: Ahmed Radwan >Assignee: Ahmed Radwan > Attachments: MAPREDUCE-4306.patch > > > Using the tarball, if you start the yarn daemons using one user and then > switch to a different user. You can successfully run MR jobs, but DS jobs > fail to run. Only able to run DS jobs using the user who started the daemons. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4306) Problem running Distributed Shell applications as a user other than the one started the daemons
[ https://issues.apache.org/jira/browse/MAPREDUCE-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291321#comment-13291321 ] Ahmed Radwan commented on MAPREDUCE-4306: - To reproduce this issue using the tarball on a single node cluster: 1- Start all the daemons using user1. 2- Switch to user2 and try to submit a distributed shell job: {code} bin/hadoop jar ./share/hadoop/mapreduce/hadoop-yarn-applications-distributedshell-3.0.0-SNAPSHOT.jar org.apache.hadoop.yarn.applications.distributedshell.Client --jar ./share/hadoop/mapreduce/hadoop-yarn-applications-distributedshell-3.0.0-SNAPSHOT.jar --shell_command ls --num_containers 1 --debug {code} I'll be uploading a patch momentarily. > Problem running Distributed Shell applications as a user other than the one > started the daemons > --- > > Key: MAPREDUCE-4306 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4306 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 2.0.0-alpha >Reporter: Ahmed Radwan >Assignee: Ahmed Radwan > > Using the tarball, if you start the yarn daemons using one user and then > switch to a different user. You can successfully run MR jobs, but DS jobs > fail to run. Only able to run DS jobs using the user who started the daemons. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4267) mavenize pipes
[ https://issues.apache.org/jira/browse/MAPREDUCE-4267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-4267: - Attachment: MAPREDUCE-4267.patch Fixes compilation of 32-bit binaries on a 64-bit machine (need to use CXX flags), and no longer packages the pom file. > mavenize pipes > -- > > Key: MAPREDUCE-4267 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4267 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 0.23.3 >Reporter: Thomas Graves >Assignee: Thomas Graves >Priority: Critical > Attachments: MAPREDUCE-4267.001.rm.patch, > MAPREDUCE-4267.001.trimmed.patch, MAPREDUCE-4267.002.trimmed.patch, > MAPREDUCE-4267.patch, MAPREDUCE-4267.patch, MAPREDUCE-4267.sh > > > We are still building pipes out of the old mrv1 directories using ant. Move > it over to the mrv2 dir structure. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4321) DefaultTaskController fails to launch tasks on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-4321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291310#comment-13291310 ] Daryn Sharp commented on MAPREDUCE-4321: It was a suggestion, so I'm perfectly fine with another jira if you think it would be useful. {{Path}} allows hadoop methods to work seamlessly with either local or remote paths. Adding {{File}} counterparts would be cumbersome, yet converting Windows files to paths isn't as straightforward. Someone will unknowingly do it wrong in the future and someone else will have to chase it down. A ctor that takes a file reduces but doesn't eliminate the chance someone will do it wrong. > DefaultTaskController fails to launch tasks on Windows > -- > > Key: MAPREDUCE-4321 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4321 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 1.0.0 >Reporter: Ivan Mitic >Assignee: Ivan Mitic > Attachments: MAPREDUCE-4321-branch-1-win.patch > > > DefaultTaskController#launchTask tries to run the child JVM task with the > following command line: > {code}cmd.exe /c /c:/some/path.../taskjvm.cmd{code} > And this fails because the given path is prefixed with a forward slash.
This > also causes a number of tests to fail: > org.apache.hadoop.conf.TestNoDefaultsJobConf > org.apache.hadoop.fs.TestCopyFiles > org.apache.hadoop.mapred.TestBadRecords > org.apache.hadoop.mapred.TestClusterMRNotification > org.apache.hadoop.mapred.TestCompressedEmptyMapOutputs > org.apache.hadoop.mapred.TestControlledMapReduceJob > org.apache.hadoop.mapred.TestCustomOutputCommitter > org.apache.hadoop.mapred.TestEmptyJob > org.apache.hadoop.mapred.TestFileOutputFormat > org.apache.hadoop.mapred.TestIsolationRunner > org.apache.hadoop.mapred.TestJavaSerialization > org.apache.hadoop.mapred.TestJobCleanup > org.apache.hadoop.mapred.TestJobCounters > org.apache.hadoop.mapred.TestJobHistoryServer > org.apache.hadoop.mapred.TestJobInProgressListener > org.apache.hadoop.mapred.TestJobKillAndFail > org.apache.hadoop.mapred.TestJobName > ... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4311) Capacity scheduler.xml does not accept decimal values for capacity and maximum-capacity settings
[ https://issues.apache.org/jira/browse/MAPREDUCE-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated MAPREDUCE-4311: Attachment: MAPREDUCE-4311.patch > Capacity scheduler.xml does not accept decimal values for capacity and > maximum-capacity settings > > > Key: MAPREDUCE-4311 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4311 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/capacity-sched, mrv2 >Affects Versions: 0.23.3 >Reporter: Thomas Graves >Assignee: Karthik Kambatla > Attachments: MAPREDUCE-4311.patch > > > if capacity scheduler capacity or max capacity set with decimal it errors: > - Error starting ResourceManager > java.lang.NumberFormatException: For input string: "10.5" > at > java.lang.NumberFormatException.forInputString(NumberFormatException.java:48) > at java.lang.Integer.parseInt(Integer.java:458) > at java.lang.Integer.parseInt(Integer.java:499) > at org.apache.hadoop.conf.Configuration.getInt(Configuration.java:713) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getCapacity(CapacitySchedulerConfiguration.java:147) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.(LeafQueue.java:147) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:297) > at > 0.20 used to take decimal and this could be an issue on large clusters that > would have queues with small allocations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4311) Capacity scheduler.xml does not accept decimal values for capacity and maximum-capacity settings
[ https://issues.apache.org/jira/browse/MAPREDUCE-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated MAPREDUCE-4311: Status: Patch Available (was: Open) Uploading the patch with the following changes: - Capacities changed to float. - Modified relevant tests to use floating-point capacities (10.5). - Ran the tests: TestCapacityScheduler, TestParentQueue, TestLeafQueue, TestRMWebServicesSched > Capacity scheduler.xml does not accept decimal values for capacity and > maximum-capacity settings > > > Key: MAPREDUCE-4311 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4311 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/capacity-sched, mrv2 >Affects Versions: 0.23.3 >Reporter: Thomas Graves >Assignee: Karthik Kambatla > > > if capacity scheduler capacity or max capacity set with decimal it errors: > - Error starting ResourceManager > java.lang.NumberFormatException: For input string: "10.5" > at > java.lang.NumberFormatException.forInputString(NumberFormatException.java:48) > at java.lang.Integer.parseInt(Integer.java:458) > at java.lang.Integer.parseInt(Integer.java:499) > at org.apache.hadoop.conf.Configuration.getInt(Configuration.java:713) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getCapacity(CapacitySchedulerConfiguration.java:147) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.(LeafQueue.java:147) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:297) > at > 0.20 used to take decimal and this could be an issue on large clusters that > would have queues with small allocations. -- This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4311) Capacity scheduler.xml does not accept decimal values for capacity and maximum-capacity settings
[ https://issues.apache.org/jira/browse/MAPREDUCE-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated MAPREDUCE-4311: Status: Open (was: Patch Available) > Capacity scheduler.xml does not accept decimal values for capacity and > maximum-capacity settings > > > Key: MAPREDUCE-4311 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4311 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/capacity-sched, mrv2 >Affects Versions: 0.23.3 >Reporter: Thomas Graves >Assignee: Karthik Kambatla > > if capacity scheduler capacity or max capacity set with decimal it errors: > - Error starting ResourceManager > java.lang.NumberFormatException: For input string: "10.5" > at > java.lang.NumberFormatException.forInputString(NumberFormatException.java:48) > at java.lang.Integer.parseInt(Integer.java:458) > at java.lang.Integer.parseInt(Integer.java:499) > at org.apache.hadoop.conf.Configuration.getInt(Configuration.java:713) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getCapacity(CapacitySchedulerConfiguration.java:147) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.(LeafQueue.java:147) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:297) > at > 0.20 used to take decimal and this could be an issue on large clusters that > would have queues with small allocations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
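The NumberFormatException in the stack trace above is exactly what Integer.parseInt produces for "10.5"; switching the capacity getters to float parsing, as the patch description says, sidesteps it. A minimal standalone sketch of the before/after parse behavior (plain Java, not the actual CapacitySchedulerConfiguration code):

```java
public class CapacityParse {
    // Configuration.getInt ultimately calls Integer.parseInt, which rejects
    // any string containing a decimal point; parsing as float accepts it.
    public static void main(String[] args) {
        String capacity = "10.5";
        try {
            Integer.parseInt(capacity);
            System.out.println("parsed as int");
        } catch (NumberFormatException e) {
            System.out.println("int parse failed: " + e.getMessage());
        }
        // The direction the patch takes: treat capacities as floats.
        float f = Float.parseFloat(capacity);
        System.out.println("parsed as float: " + f);
    }
}
```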
[jira] [Updated] (MAPREDUCE-3921) MR AM should act on the nodes liveliness information when nodes go up/down/unhealthy
[ https://issues.apache.org/jira/browse/MAPREDUCE-3921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated MAPREDUCE-3921: - Status: Open (was: Patch Available) Sorry to come in late. Some clarifications: # MR1 JT kills all running tasks on a TT when it's deemed 'lost'. # It also kills all completed maps on that TT for 'active' jobs. # The tasks are marked KILLED rather than FAILED and thus don't count towards the job, which is correct since it wasn't the job's fault. Hope this helps. > MR AM should act on the nodes liveliness information when nodes go > up/down/unhealthy > > > Key: MAPREDUCE-3921 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3921 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am, mrv2 >Affects Versions: 0.23.0 >Reporter: Vinod Kumar Vavilapalli >Assignee: Bikas Saha > Fix For: 0.23.2 > > Attachments: MAPREDUCE-3921-1.patch, MAPREDUCE-3921-3.patch, > MAPREDUCE-3921-4.patch, MAPREDUCE-3921-5.patch, MAPREDUCE-3921-6.patch, > MAPREDUCE-3921-7.patch, MAPREDUCE-3921-branch-0.23.patch, > MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921-branch-0.23.patch, > MAPREDUCE-3921.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
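The three MR1 rules above can be written down as a small predicate. This is only a sketch of the semantics as described in the comment — the Task class, its fields, and shouldKill are invented for illustration, not JobTracker code:

```java
public class LostTrackerPolicy {
    enum Kind { MAP, REDUCE }

    static class Task {
        final Kind kind;
        final boolean completed;
        final boolean jobActive; // the job still needs this node's map output
        Task(Kind kind, boolean completed, boolean jobActive) {
            this.kind = kind;
            this.completed = completed;
            this.jobActive = jobActive;
        }
    }

    // Rule 1: every running task on the lost tracker is killed.
    // Rule 2: completed maps of still-active jobs are also killed, since
    // their output lived on the lost node. Completed reduces survive
    // (their output is already on HDFS). Rule 3 (marking tasks KILLED
    // rather than FAILED) is a labeling decision, not modeled here.
    static boolean shouldKill(Task t) {
        if (!t.completed) {
            return true;
        }
        return t.kind == Kind.MAP && t.jobActive;
    }

    public static void main(String[] args) {
        System.out.println(shouldKill(new Task(Kind.MAP, false, true)));   // running: kill
        System.out.println(shouldKill(new Task(Kind.MAP, true, true)));    // done map, active job: kill
        System.out.println(shouldKill(new Task(Kind.MAP, true, false)));   // done map, finished job: keep
        System.out.println(shouldKill(new Task(Kind.REDUCE, true, true))); // done reduce: keep
    }
}
```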
[jira] [Commented] (MAPREDUCE-4136) Hadoop streaming might succeed even though reducer fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-4136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291260#comment-13291260 ] Matteo Bertozzi commented on MAPREDUCE-4136: Yes, this is fixed by MAPREDUCE-3790; the IOException during clientOut_.flush() is now caught. > Hadoop streaming might succeed even though reducer fails > - > > Key: MAPREDUCE-4136 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4136 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.20.205.0 >Reporter: Wouter de Bie > Attachments: mapreduce-4136.patch > > > Hadoop streaming can succeed even though the reducer has failed. This > happens when Hadoop calls {{PipeReducer.close()}}, but in the meantime the > reducer has failed and the process has died. When {{clientOut_.flush()}} > throws an {{IOException}} in {{PipeMapRed.mapRedFinish()}} this exception is > caught but only logged. The exit status of the child process is never checked > and the task is marked as successful. > I've attached a patch that seems to fix it for us. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
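The failure mode under discussion — a flush error that is logged and swallowed, while the child's exit status is never checked — can be sketched outside of streaming. This is illustrative only: finishChild is an invented name, not PipeMapRed.mapRedFinish(), but it shows why checking the exit code catches a dead reducer even when the flush itself does not throw:

```java
import java.io.IOException;

public class PipeFinish {
    // Tolerate a broken pipe on flush (as MAPREDUCE-3790 does), but still
    // fail the task if the child process exited with a nonzero status.
    static void finishChild(Process child) throws IOException {
        try {
            child.getOutputStream().flush();
        } catch (IOException e) {
            // The child may already be dead; log and keep going.
            System.err.println("flush failed: " + e);
        }
        int exit;
        try {
            exit = child.waitFor();
        } catch (InterruptedException e) {
            throw new IOException("interrupted waiting for child", e);
        }
        if (exit != 0) {
            throw new IOException("child exited with status " + exit);
        }
    }

    public static void main(String[] args) throws Exception {
        Process failing = new ProcessBuilder("sh", "-c", "exit 7").start();
        try {
            finishChild(failing);
            System.out.println("task marked successful");
        } catch (IOException e) {
            System.out.println("task failed: " + e.getMessage());
        }
    }
}
```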
[jira] [Commented] (MAPREDUCE-2377) task-controller fails to parse configuration if it doesn't end in \n
[ https://issues.apache.org/jira/browse/MAPREDUCE-2377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291251#comment-13291251 ] Todd Lipcon commented on MAPREDUCE-2377: Verified this bug is not present in the MR2 container executor, so marked as resolved. > task-controller fails to parse configuration if it doesn't end in \n > > > Key: MAPREDUCE-2377 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-2377 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: task-controller >Affects Versions: 0.22.0 >Reporter: Todd Lipcon >Assignee: Benoy Antony > Labels: critical-0.22.0 > Fix For: 1.1.0, 0.22.1 > > Attachments: mr-2377-0.22.patch, mr-2377-20.txt > > > If the task-controller.cfg file doesn't end in a newline, it fails to parse > properly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
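For reference, the requirement behind this bug is simply that the last key=value line of task-controller.cfg parses whether or not a trailing newline is present. The real parser is C; this Java sketch (invented names, stdlib only) just demonstrates the tolerant line-reading behavior a fixed parser needs:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

public class CfgParse {
    // readLine() returns the final line whether or not it ends in '\n',
    // which is exactly the tolerance the buggy C parser lacked.
    static Map<String, String> parse(String cfg) throws IOException {
        Map<String, String> out = new LinkedHashMap<>();
        BufferedReader r = new BufferedReader(new StringReader(cfg));
        String line;
        while ((line = r.readLine()) != null) {
            int eq = line.indexOf('=');
            if (eq > 0) {
                out.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // Note: no trailing newline on the last line, as in the bug report.
        System.out.println(parse("mapreduce.tasktracker.group=hadoop\nmin.user.id=1000"));
    }
}
```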
[jira] [Resolved] (MAPREDUCE-2377) task-controller fails to parse configuration if it doesn't end in \n
[ https://issues.apache.org/jira/browse/MAPREDUCE-2377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon resolved MAPREDUCE-2377. Resolution: Fixed > task-controller fails to parse configuration if it doesn't end in \n > > > Key: MAPREDUCE-2377 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-2377 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: task-controller >Affects Versions: 0.22.0 >Reporter: Todd Lipcon >Assignee: Benoy Antony > Labels: critical-0.22.0 > Fix For: 1.1.0, 0.22.1 > > Attachments: mr-2377-0.22.patch, mr-2377-20.txt > > > If the task-controller.cfg file doesn't end in a newline, it fails to parse > properly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4321) DefaultTaskController fails to launch tasks on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-4321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291248#comment-13291248 ] Ivan Mitic commented on MAPREDUCE-4321: --- Thanks Daryn! bq. One suggestion to maybe consider, would it in general help to create a Path#toFile() and Path(File) ctor? I had considered the additional ctor change on HADOOP-8139 and I believe Doug liked the idea. I also like the idea. Although one can argue that, instead of doing these conversions, we should just use File or Path across the board in that scenario (RawLocalFileSystem aside). What do you think about doing this in a separate Jira so that we can easily pull it out if needed? > DefaultTaskController fails to launch tasks on Windows > -- > > Key: MAPREDUCE-4321 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4321 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 1.0.0 >Reporter: Ivan Mitic >Assignee: Ivan Mitic > Attachments: MAPREDUCE-4321-branch-1-win.patch > > > DefaultTaskController#launchTask tries to run the child JVM task with the > following command line: > {code}cmd.exe /c /c:/some/path.../taskjvm.cmd{code} > And this fails because the given path is prefixed with a forward slash.
This > also causes a number of tests to fail: > org.apache.hadoop.conf.TestNoDefaultsJobConf > org.apache.hadoop.fs.TestCopyFiles > org.apache.hadoop.mapred.TestBadRecords > org.apache.hadoop.mapred.TestClusterMRNotification > org.apache.hadoop.mapred.TestCompressedEmptyMapOutputs > org.apache.hadoop.mapred.TestControlledMapReduceJob > org.apache.hadoop.mapred.TestCustomOutputCommitter > org.apache.hadoop.mapred.TestEmptyJob > org.apache.hadoop.mapred.TestFileOutputFormat > org.apache.hadoop.mapred.TestIsolationRunner > org.apache.hadoop.mapred.TestJavaSerialization > org.apache.hadoop.mapred.TestJobCleanup > org.apache.hadoop.mapred.TestJobCounters > org.apache.hadoop.mapred.TestJobHistoryServer > org.apache.hadoop.mapred.TestJobInProgressListener > org.apache.hadoop.mapred.TestJobKillAndFail > org.apache.hadoop.mapred.TestJobName > ... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
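The bogus command line quoted in the issue comes from handing cmd.exe a Path-style string with a leading forward slash before the drive letter. A hypothetical normalization helper (not the code from the attached patch) shows the transformation a fix has to make:

```java
public class WinPath {
    // Turn a Path-style "/c:/some/path" into a form cmd.exe accepts:
    // drop the leading slash before a drive letter and use backslashes.
    // Hypothetical helper for illustration only.
    static String toCmdPath(String p) {
        if (p.length() >= 3 && p.charAt(0) == '/'
                && Character.isLetter(p.charAt(1)) && p.charAt(2) == ':') {
            p = p.substring(1);
        }
        return p.replace('/', '\\');
    }

    public static void main(String[] args) {
        // -> c:\some\path\taskjvm.cmd
        System.out.println(toCmdPath("/c:/some/path/taskjvm.cmd"));
    }
}
```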
[jira] [Moved] (MAPREDUCE-4325) Rename ProcessTree.isSetsidAvailable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha moved HADOOP-8492 to MAPREDUCE-4325: --- Fix Version/s: (was: 1.1.0) 1.1.0 Affects Version/s: (was: 1.0.0) 1.0.0 Key: MAPREDUCE-4325 (was: HADOOP-8492) Project: Hadoop Map/Reduce (was: Hadoop Common) > Rename ProcessTree.isSetsidAvailable > > > Key: MAPREDUCE-4325 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4325 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 1.0.0 >Reporter: Bikas Saha >Assignee: Bikas Saha > Fix For: 1.1.0 > > > The logical use of this member is to find out if processes can be grouped > into a unit for process manipulation. eg. killing process groups etc. > setsid is the Linux implementation and it leaks into the name. > I suggest renaming it to isProcessGroupAvailable. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4318) TestRecoveryManager should not use raw and deprecated configuration parameters.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291203#comment-13291203 ] Benoy Antony commented on MAPREDUCE-4318: - The other option was to use the new scheme of specifying a mapred-queues.xml containing the queue configuration. I used the QueueManagerTestUtils class to achieve this. But there are other mapred-queues.xml files on the classpath, with different configuration, which get picked up before the test's mapred-queues.xml. These files seem to be created when I build using Eclipse, and if I remove those mapred-queues.xml files, the test passes. So this may be an Eclipse-created problem. The old scheme of defining queues does not use mapred-queues.xml and hence works regardless of any multiple-mapred-queues.xml issues. Since we are not testing queue management here, I believe keeping the following line makes the test more reliable. mr.getJobTrackerConf().set(DeprecatedQueueConfigurationParser.MAPRED_QUEUE_NAMES_KEY, "default"); So I recommend going with the attached patch. Please let me know if there are some other ideas. > TestRecoveryManager should not use raw and deprecated configuration parameters. > -- > > Key: MAPREDUCE-4318 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4318 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Affects Versions: 0.22.1 >Reporter: Konstantin Shvachko >Assignee: Benoy Antony > Attachments: MAPREDUCE-4318.patch > > > TestRecoveryManager should not use deprecated config keys, and should use > constants for the keys where possible. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4260) Investigate use of JobObject to spawn tasks on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291193#comment-13291193 ] Bikas Saha commented on MAPREDUCE-4260: --- 1. I agree the check looks a bit weird. But I put that in because I am not sure how this will affect existing Cygwin installations where people may not have winutils built. I think that is something we need to figure out. The code in winutils already has the dependency comment. 2. Will do 3. Again, I am mainly concerned about installations that don't have winutils and also to guard against any unexpected use cases that might break. I agree that this code should disappear soon. 4. Will do 5. From what I understand this error would come if the JOBOBJECT_BASIC_PROCESS_ID_LIST does not have enough space to return information about all processes. In that case, one needs to reallocate the structure based on the value of NumberOfAssignedProcesses and call QueryInformationJobObject() again. Since I am not interested in per-process information, I chose to ignore that error. Let me know if my understanding is not accurate. > Investigate use of JobObject to spawn tasks on Windows > -- > > Key: MAPREDUCE-4260 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4260 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.0 >Reporter: Bikas Saha >Assignee: Bikas Saha > Attachments: MAPREDUCE-4260.branch-1-win.patch, MAPREDUCE-4260.patch, > test.cpp > > > Currently, the Windows version spawns the task as a normal cmd shell from > which other downstream exe's are spawned. However, this is not bullet proof > because if an intermediate process exits before its child exits, then the > parent child process tree relationship cannot be constructed. Windows has a > concept of JobObject that is similar to the setsid behavior used in Linux. > The initial spawned task could be launched within its JobObject.
Thereafter, > process termination, memory management etc could be operated on the JobObject. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4323) NM leaks sockets
[ https://issues.apache.org/jira/browse/MAPREDUCE-4323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291192#comment-13291192 ] Todd Lipcon commented on MAPREDUCE-4323: See comment on HADOOP-8490: I think the NM should just be side-stepping the FS cache, so it can explicitly close the FS when necessary. > NM leaks sockets > > > Key: MAPREDUCE-4323 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4323 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: nodemanager >Affects Versions: 0.23.0, 0.24.0, 2.0.0-alpha >Reporter: Daryn Sharp >Priority: Critical > > The NM is exhausting its fds because it's not closing fs instances when the > app is finished. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4323) NM leaks sockets
[ https://issues.apache.org/jira/browse/MAPREDUCE-4323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291175#comment-13291175 ] Tsz Wo (Nicholas), SZE commented on MAPREDUCE-4323: --- This looks like a problem of the newly added socket cache. Once it is fixed (say, it is removed for the sake of discussion), are there other problems? > NM leaks sockets > > > Key: MAPREDUCE-4323 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4323 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: nodemanager >Affects Versions: 0.23.0, 0.24.0, 2.0.0-alpha >Reporter: Daryn Sharp >Priority: Critical > > The NM is exhausting its fds because it's not closing fs instances when the > app is finished. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4324) JobClient can perhaps set mapreduce.job.credentials.binary rather than expect its presence?
[ https://issues.apache.org/jira/browse/MAPREDUCE-4324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291137#comment-13291137 ] Ashutosh Chauhan commented on MAPREDUCE-4324: - As someone working higher up the stack, I have seen this {{if}} code block in all the clients. Ideally, {{jobclient}} should do it, freeing apps from this unnecessary requirement. Thanks, Harsh, for picking this up! > JobClient can perhaps set mapreduce.job.credentials.binary rather than expect > its presence? > --- > > Key: MAPREDUCE-4324 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4324 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv1, mrv2, security >Affects Versions: 0.22.0, 2.0.0-alpha >Reporter: Harsh J >Assignee: Harsh J > > HDFS-1007 added in this requirement property > "mapreduce.job.credentials.binary", that has led Oozie to add the following > duplicate snippet to all its Job-launching main classes such as the Pig, > Hive, MR and Sqoop actions: > {code} > if (System.getenv("HADOOP_TOKEN_FILE_LOCATION") != null) { > jobConf.set("mapreduce.job.credentials.binary", > System.getenv("HADOOP_TOKEN_FILE_LOCATION")); > } > {code} > Same is required for any client program that launches a job from within a > task. > Why can't this simply be set by the JobClient initialization bits itself? 
If > no one imagines it causing issues, I'd like to add this snippet somewhere in > JobSubmitter before it requests NN/JT, as otherwise we'd get… > {code} > org.apache.hadoop.ipc.RemoteException: java.io.IOException: Delegation Token > can be issued only with kerberos or web authentication > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getDelegationToken(FSNamesystem.java:5509) > > at > org.apache.hadoop.hdfs.server.namenode.NameNode.getDelegationToken(NameNode.java:536) > > at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > > at java.lang.reflect.Method.invoke(Method.java:597) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1434) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1430) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157) > > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1428) > at org.apache.hadoop.ipc.Client.call(Client.java:1107) > at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226) > at $Proxy6.getDelegationToken(Unknown Source) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > > at java.lang.reflect.Method.invoke(Method.java:597) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) > > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) > > at $Proxy6.getDelegationToken(Unknown Source) > at org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:331) > at > 
org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:605) > > at > org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:115) > > at > org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:79) > > at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:851) > at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157) > > at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833) > at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:807) > at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1242) > {code} > … or similar errors when a user submits a job from a task running in a > secured cluster. > Let me know your thoughts on this! -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
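The centralization being discussed amounts to a small defaulting rule at submission time. The sketch below is hypothetical — the real change would live in JobClient/JobSubmitter and read {{System.getenv}} and a real JobConf; here plain maps stand in for both, and the helper name `applyTokenFileDefault` is an assumption:

```java
import java.util.Map;

// Hypothetical sketch of the proposed centralization: the submission path
// would default mapreduce.job.credentials.binary from the task environment,
// instead of every client (Pig, Hive, MR, Sqoop actions, ...) doing it.
// The two maps stand in for System.getenv() and JobConf.
public class CredentialsDefaulting {
    static final String CREDENTIALS_KEY = "mapreduce.job.credentials.binary";
    static final String TOKEN_ENV = "HADOOP_TOKEN_FILE_LOCATION";

    // Copy the token file location from the environment into the job
    // configuration, unless the submitter already set it explicitly.
    static void applyTokenFileDefault(Map<String, String> env,
                                      Map<String, String> conf) {
        String tokenFile = env.get(TOKEN_ENV);
        if (tokenFile != null && !conf.containsKey(CREDENTIALS_KEY)) {
            conf.put(CREDENTIALS_KEY, tokenFile);
        }
    }
}
```

Guarding on the key's absence preserves any value a user sets deliberately, so the defaulting stays backward compatible.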
[jira] [Updated] (MAPREDUCE-4324) JobClient can perhaps set mapreduce.job.credentials.binary rather than expect its presence?
[ https://issues.apache.org/jira/browse/MAPREDUCE-4324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated MAPREDUCE-4324: --- Component/s: security > JobClient can perhaps set mapreduce.job.credentials.binary rather than expect > its presence? > --- > > Key: MAPREDUCE-4324 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4324 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv1, mrv2, security >Affects Versions: 0.22.0, 2.0.0-alpha >Reporter: Harsh J >Assignee: Harsh J > > HDFS-1007 added in this requirement property > "mapreduce.job.credentials.binary", that has lead Oozie to add the following > duplicate snippet to all its Job-launching main classes such as the Pig, > Hive, MR and Sqoop actions: > {code} > if (System.getenv("HADOOP_TOKEN_FILE_LOCATION") != null) { > jobConf.set("mapreduce.job.credentials.binary", > System.getenv("HADOOP_TOKEN_FILE_LOCATION")); > } > {code} > Same is required for any client program that launches a job from within a > task. > Why can't this simply be set by the JobClient initialization bits itself? 
If > no one imagines it causing issues, I'd like to add this snippet somewhere in > JobSubmitter before it requests NN/JT, as otherwise we'd get… > {code} > org.apache.hadoop.ipc.RemoteException: java.io.IOException: Delegation Token > can be issued only with kerberos or web authentication > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getDelegationToken(FSNamesystem.java:5509) > > at > org.apache.hadoop.hdfs.server.namenode.NameNode.getDelegationToken(NameNode.java:536) > > at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > > at java.lang.reflect.Method.invoke(Method.java:597) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1434) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1430) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157) > > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1428) > at org.apache.hadoop.ipc.Client.call(Client.java:1107) > at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226) > at $Proxy6.getDelegationToken(Unknown Source) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > > at java.lang.reflect.Method.invoke(Method.java:597) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) > > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) > > at $Proxy6.getDelegationToken(Unknown Source) > at org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:331) > at > 
org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:605) > > at > org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:115) > > at > org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:79) > > at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:851) > at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157) > > at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833) > at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:807) > at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1242) > {code} > … or similar errors when a user submits a job from a task running in a > secured cluster. > Let me know your thoughts on this! -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4267) mavenize pipes
[ https://issues.apache.org/jira/browse/MAPREDUCE-4267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291124#comment-13291124 ] Thomas Graves commented on MAPREDUCE-4267: -- Sorry looks like my previous comment was wrong, need to debug further. > mavenize pipes > -- > > Key: MAPREDUCE-4267 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4267 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 0.23.3 >Reporter: Thomas Graves >Assignee: Thomas Graves >Priority: Critical > Attachments: MAPREDUCE-4267.001.rm.patch, > MAPREDUCE-4267.001.trimmed.patch, MAPREDUCE-4267.002.trimmed.patch, > MAPREDUCE-4267.patch, MAPREDUCE-4267.sh > > > We are still building pipes out of the old mrv1 directories using ant. Move > it over to the mrv2 dir structure. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4267) mavenize pipes
[ https://issues.apache.org/jira/browse/MAPREDUCE-4267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291117#comment-13291117 ] Thomas Graves commented on MAPREDUCE-4267: -- Looks like the bit I added from HADOOP-8489 broke the 32-bit build when building from a 64-bit machine. I'll back that out. > mavenize pipes > -- > > Key: MAPREDUCE-4267 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4267 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 0.23.3 >Reporter: Thomas Graves >Assignee: Thomas Graves >Priority: Critical > Attachments: MAPREDUCE-4267.001.rm.patch, > MAPREDUCE-4267.001.trimmed.patch, MAPREDUCE-4267.002.trimmed.patch, > MAPREDUCE-4267.patch, MAPREDUCE-4267.sh > > > We are still building pipes out of the old mrv1 directories using ant. Move > it over to the mrv2 dir structure. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4267) mavenize pipes
[ https://issues.apache.org/jira/browse/MAPREDUCE-4267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-4267: - Attachment: MAPREDUCE-4267.sh This should be run before the patch. ./MAPREDUCE-4267.sh svn patch -p0 < MAPREDUCE-4267.patch > mavenize pipes > -- > > Key: MAPREDUCE-4267 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4267 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 0.23.3 >Reporter: Thomas Graves >Assignee: Thomas Graves >Priority: Critical > Attachments: MAPREDUCE-4267.001.rm.patch, > MAPREDUCE-4267.001.trimmed.patch, MAPREDUCE-4267.002.trimmed.patch, > MAPREDUCE-4267.patch, MAPREDUCE-4267.sh > > > We are still building pipes out of the old mrv1 directories using ant. Move > it over to the mrv2 dir structure. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2980) Fetch failures and other related issues in Jetty 6.1.26
[ https://issues.apache.org/jira/browse/MAPREDUCE-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291112#comment-13291112 ] Todd Lipcon commented on MAPREDUCE-2980: Still no 6.1.27. We've been shipping the version I linked to from github above: https://github.com/toddlipcon/jetty-hadoop-fix/tree/6.1.26.cloudera.1 That, combined with MAPREDUCE-3184 has made the problem quite livable. We also found that the upgrade from 6.1.26 to the github branch improved performance noticeably for shuffle-intensive jobs. > Fetch failures and other related issues in Jetty 6.1.26 > --- > > Key: MAPREDUCE-2980 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-2980 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: tasktracker >Affects Versions: 0.20.205.0, 0.23.0 >Reporter: Todd Lipcon >Priority: Critical > > Since upgrading Jetty from 6.1.14 to 6.1.26 we've had a ton of HTTP-related > issues, including: > - Much higher incidence of fetch failures > - A few strange file-descriptor related bugs (eg MAPREDUCE-2389) > - A few unexplained issues where long "fsck"s on the NameNode drop out > halfway through with a ClosedChannelException > Stress tests with 1Map x 1Reduce sleep jobs reliably reproduce fetch > failures at a rate of about 1 per million on a 25 node test cluster. These > problems are all new since the upgrade from 6.1.14. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4267) mavenize pipes
[ https://issues.apache.org/jira/browse/MAPREDUCE-4267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-4267: - Attachment: MAPREDUCE-4267.patch Here is an initial patch. Files added to the tarball are listed below. I'm using the packaging type of pom in the hadoop-pipes pom.xml because all it does is run ant to generate the libraries via cmake and there are no jar files. If anyone has a better way of doing this, let me know. I'll see if I can figure out a way to tell it not to package the pom. hadoop-3.0.0-SNAPSHOT/include/Pipes.hh > hadoop-3.0.0-SNAPSHOT/include/SerialUtils.hh > hadoop-3.0.0-SNAPSHOT/include/StringUtils.hh > hadoop-3.0.0-SNAPSHOT/include/TemplateFactory.hh 38a43 > hadoop-3.0.0-SNAPSHOT/lib/native/libhadooppipes.a 39a45 > hadoop-3.0.0-SNAPSHOT/lib/native/libhadooputils.a 522a529 > hadoop-3.0.0-SNAPSHOT/share/hadoop/tools/lib/hadoop-pipes-3.0.0-SNAPSHOT.pom > mavenize pipes > -- > > Key: MAPREDUCE-4267 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4267 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 0.23.3 >Reporter: Thomas Graves >Assignee: Thomas Graves >Priority: Critical > Attachments: MAPREDUCE-4267.001.rm.patch, > MAPREDUCE-4267.001.trimmed.patch, MAPREDUCE-4267.002.trimmed.patch, > MAPREDUCE-4267.patch, MAPREDUCE-4267.sh > > > We are still building pipes out of the old mrv1 directories using ant. Move > it over to the mrv2 dir structure. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-4324) JobClient can perhaps set mapreduce.job.credentials.binary rather than expect its presence?
Harsh J created MAPREDUCE-4324: -- Summary: JobClient can perhaps set mapreduce.job.credentials.binary rather than expect its presence? Key: MAPREDUCE-4324 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4324 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1, mrv2 Affects Versions: 2.0.0-alpha, 0.22.0 Reporter: Harsh J Assignee: Harsh J HDFS-1007 added in this requirement property "mapreduce.job.credentials.binary", that has lead Oozie to add the following duplicate snippet to all its Job-launching main classes such as the Pig, Hive, MR and Sqoop actions: {code} if (System.getenv("HADOOP_TOKEN_FILE_LOCATION") != null) { jobConf.set("mapreduce.job.credentials.binary", System.getenv("HADOOP_TOKEN_FILE_LOCATION")); } {code} Same is required for any client program that launches a job from within a task. Why can't this simply be set by the JobClient initialization bits itself? If no one imagines it causing issues, I'd like to add this snippet somewhere in JobSubmitter before it requests NN/JT, as otherwise we'd get… {code} org.apache.hadoop.ipc.RemoteException: java.io.IOException: Delegation Token can be issued only with kerberos or web authentication at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getDelegationToken(FSNamesystem.java:5509) at org.apache.hadoop.hdfs.server.namenode.NameNode.getDelegationToken(NameNode.java:536) at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1434) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1430) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157) at 
org.apache.hadoop.ipc.Server$Handler.run(Server.java:1428) at org.apache.hadoop.ipc.Client.call(Client.java:1107) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226) at $Proxy6.getDelegationToken(Unknown Source) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) at $Proxy6.getDelegationToken(Unknown Source) at org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:331) at org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:605) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:115) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:79) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:851) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833) at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:807) at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1242) {code} … or similar errors when a user submits a job from a task running in a secured cluster. Let me know your thoughts on this! -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4323) NM leaks sockets
[ https://issues.apache.org/jira/browse/MAPREDUCE-4323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291078#comment-13291078 ] Daryn Sharp commented on MAPREDUCE-4323: In particular, {{DFSClient}} maintains a socket cache. Closed sockets are not detected until another connection is needed, or the client is closed. That's another issue, but the NM's failure to close filesystems for a user after the app completes causes a leak of sockets in the CLOSE_WAIT state that eventually exhausts fds for the process. Calling {{FileSystem.closeAllForUGI}}, as the JT does, is troublesome in that it may close the fs for other apps running as that user. One approach is to partition the fs cache to allow each app to maintain its own cache of filesystems. See HADOOP-8490 for possible approaches, which would allow the closing of the app's filesystems a la the JT. Also note that failure to close filesystems causes all future jobs to use the configuration of the first job. This will be very problematic, so it's imperative to ensure apps each get their own cached instances. > NM leaks sockets > > > Key: MAPREDUCE-4323 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4323 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: nodemanager >Affects Versions: 0.23.0, 0.24.0, 2.0.0-alpha >Reporter: Daryn Sharp >Priority: Critical > > The NM is exhausting its fds because it's not closing fs instances when the > app is finished. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
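The per-app partitioning idea can be sketched without the real Hadoop {{FileSystem}} cache. This is a hypothetical illustration of the approach discussed above, not the HADOOP-8490 patch; the class name `PerAppFsCache` and its method names are assumptions, and any `AutoCloseable` stands in for a filesystem instance:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a per-application filesystem cache: keying cached
// instances by (appId, fsKey) lets the NM close one app's filesystems at
// app completion without touching other apps running as the same user.
public class PerAppFsCache<F extends AutoCloseable> {
    private final Map<String, Map<String, F>> cache = new HashMap<>();

    public synchronized void put(String appId, String fsKey, F fs) {
        cache.computeIfAbsent(appId, k -> new HashMap<>()).put(fsKey, fs);
    }

    public synchronized F get(String appId, String fsKey) {
        Map<String, F> perApp = cache.get(appId);
        return perApp == null ? null : perApp.get(fsKey);
    }

    // Close and drop everything cached for one app, leaving other apps'
    // instances (and their configurations) alone.
    public synchronized void closeAllForApp(String appId) {
        Map<String, F> perApp = cache.remove(appId);
        if (perApp != null) {
            for (F fs : perApp.values()) {
                try {
                    fs.close();
                } catch (Exception e) {
                    // Best-effort close; a real implementation would log this.
                }
            }
        }
    }
}
```

Because each app holds its own instances, closing them at app completion also avoids the configuration-bleed problem noted above, where later jobs would inherit the first job's cached filesystem configuration.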
[jira] [Created] (MAPREDUCE-4323) NM leaks sockets
Daryn Sharp created MAPREDUCE-4323: -- Summary: NM leaks sockets Key: MAPREDUCE-4323 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4323 Project: Hadoop Map/Reduce Issue Type: Bug Components: nodemanager Affects Versions: 2.0.0-alpha, 0.23.0, 0.24.0 Reporter: Daryn Sharp Priority: Critical The NM is exhausting its fds because it's not closing fs instances when the app is finished. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3842) Add a toggle button to all web pages to stop automatic refreshs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291057#comment-13291057 ] Hadoop QA commented on MAPREDUCE-3842: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12531265/MAPREDUCE-3842.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2444//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2444//console This message is automatically generated. 
> Add a toggle button to all web pages to stop automatic refreshs > --- > > Key: MAPREDUCE-3842 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3842 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mrv2, webapps >Affects Versions: 0.23.1 >Reporter: Alejandro Abdelnur >Assignee: Thomas Graves >Priority: Critical > Attachments: MAPREDUCE-3842.patch > > > The automatic refresh makes quiet hard to look at something specific as it > makes the page jump and sometime resets its position. > This is specially painful when looking at jobs with large number of tasks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3842) Add a toggle button to all web pages to stop automatic refreshs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-3842: - Target Version/s: 0.23.3 Status: Patch Available (was: Open) patch manually tested. > Add a toggle button to all web pages to stop automatic refreshs > --- > > Key: MAPREDUCE-3842 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3842 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mrv2, webapps >Affects Versions: 0.23.1 >Reporter: Alejandro Abdelnur >Assignee: Thomas Graves >Priority: Critical > Attachments: MAPREDUCE-3842.patch > > > The automatic refresh makes quiet hard to look at something specific as it > makes the page jump and sometime resets its position. > This is specially painful when looking at jobs with large number of tasks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (MAPREDUCE-3842) Add a toggle button to all web pages to stop automatic refreshs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves reassigned MAPREDUCE-3842: Assignee: Thomas Graves > Add a toggle button to all web pages to stop automatic refreshs > --- > > Key: MAPREDUCE-3842 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3842 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mrv2, webapps >Affects Versions: 0.23.1 >Reporter: Alejandro Abdelnur >Assignee: Thomas Graves >Priority: Critical > Attachments: MAPREDUCE-3842.patch > > > The automatic refresh makes quiet hard to look at something specific as it > makes the page jump and sometime resets its position. > This is specially painful when looking at jobs with large number of tasks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3842) Add a toggle button to all web pages to stop automatic refreshs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-3842: - Attachment: MAPREDUCE-3842.patch Refresh becomes a big issue when trying to debug with a large number of tasks/attempts. This patch removes refresh on all the pages for now, which I think makes behavior consistent across them. > Add a toggle button to all web pages to stop automatic refreshs > --- > > Key: MAPREDUCE-3842 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3842 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mrv2, webapps >Affects Versions: 0.23.1 >Reporter: Alejandro Abdelnur >Priority: Critical > Attachments: MAPREDUCE-3842.patch > > > The automatic refresh makes quiet hard to look at something specific as it > makes the page jump and sometime resets its position. > This is specially painful when looking at jobs with large number of tasks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4321) DefaultTaskController fails to launch tasks on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-4321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291027#comment-13291027 ] Daryn Sharp commented on MAPREDUCE-4321: +1 Looks good! One suggestion to maybe consider: would it in general help to create a {{Path#toFile()}} method and a {{Path(File)}} ctor? I had considered the additional ctor change on HADOOP-8139, and I believe Doug liked the idea. > DefaultTaskController fails to launch tasks on Windows > -- > > Key: MAPREDUCE-4321 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4321 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 1.0.0 >Reporter: Ivan Mitic >Assignee: Ivan Mitic > Attachments: MAPREDUCE-4321-branch-1-win.patch > > > DefaultTaskController#launchTask tries to run the child JVM task with the > following command line: > {code}cmd.exe /c /c:/some/path.../taskjvm.cmd{code} > And this fails because the given path is prefixed with a forward slash. This > also causes a number of tests to fail: > org.apache.hadoop.conf.TestNoDefaultsJobConf > org.apache.hadoop.fs.TestCopyFiles > org.apache.hadoop.mapred.TestBadRecords > org.apache.hadoop.mapred.TestClusterMRNotification > org.apache.hadoop.mapred.TestCompressedEmptyMapOutputs > org.apache.hadoop.mapred.TestControlledMapReduceJob > org.apache.hadoop.mapred.TestCustomOutputCommitter > org.apache.hadoop.mapred.TestEmptyJob > org.apache.hadoop.mapred.TestFileOutputFormat > org.apache.hadoop.mapred.TestIsolationRunner > org.apache.hadoop.mapred.TestJavaSerialization > org.apache.hadoop.mapred.TestJobCleanup > org.apache.hadoop.mapred.TestJobCounters > org.apache.hadoop.mapred.TestJobHistoryServer > org.apache.hadoop.mapred.TestJobInProgressListener > org.apache.hadoop.mapred.TestJobKillAndFail > org.apache.hadoop.mapred.TestJobName > ... -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
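The underlying bug — a Windows drive-letter path rendered with a leading forward slash — comes down to a small normalization step. The helper below is a hypothetical illustration, not the actual patch; the name `stripLeadingSlashOnWindows` and its narrow drive-letter check are assumptions, and a real {{Path#toFile()}} would have to handle more cases:

```java
// Hypothetical helper illustrating the conversion DefaultTaskController
// needs: drop the leading forward slash that a URI-style path string
// leaves in front of a Windows drive letter, e.g. "/c:/foo" -> "c:/foo".
public class WindowsPathFix {
    static String stripLeadingSlashOnWindows(String path) {
        if (path.length() >= 3
                && path.charAt(0) == '/'
                && Character.isLetter(path.charAt(1))
                && path.charAt(2) == ':') {
            // "/c:/some/path" -> "c:/some/path"; cmd.exe accepts the latter.
            return path.substring(1);
        }
        // Non-drive-letter paths (e.g. Unix paths) pass through unchanged.
        return path;
    }
}
```
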
[jira] [Commented] (MAPREDUCE-2454) Allow external sorter plugin for MR
[ https://issues.apache.org/jira/browse/MAPREDUCE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291028#comment-13291028 ] Mariappan Asokan commented on MAPREDUCE-2454: - The failing test seems to be a flaky one. Googling on {{org.apache.hadoop.mapred.TestReduceFetchFromPartialMem}} shows a lot of hits in mapreduce jira. I will look at the test more closely to see whether it can be fixed. I welcome input from other developers on this. Meanwhile, I can retry the same patch file to see whether this failure goes away magically. > Allow external sorter plugin for MR > --- > > Key: MAPREDUCE-2454 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-2454 > Project: Hadoop Map/Reduce > Issue Type: New Feature >Affects Versions: 2.0.0-alpha, 3.0.0 >Reporter: Mariappan Asokan >Priority: Minor > Labels: features, performance, plugin, sort > Attachments: HadoopSortPlugin.pdf, KeyValueIterator.java, > MR-2454-trunkPatchPreview.gz, MapOutputSorter.java, > MapOutputSorterAbstract.java, ReduceInputSorter.java, mapreduce-2454.patch, > mr-2454-on-mr-279-build82.patch.gz > > > Define interfaces and some abstract classes in the Hadoop framework to > facilitate external sorter plugins both on the Map and Reduce sides. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4320) gridmix mainClass wrong in pom.xml
[ https://issues.apache.org/jira/browse/MAPREDUCE-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290991#comment-13290991 ] Thomas Graves commented on MAPREDUCE-4320: -- The findbugs are known issue - see MAPREDUCE-4239 > gridmix mainClass wrong in pom.xml > -- > > Key: MAPREDUCE-4320 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4320 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/gridmix >Affects Versions: 0.23.3 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: MAPREDUCE-4320.patch > > > when trying to run gridmix its actually trying to run > org.apache.hadoop.tools.HadoopArchives. > the pom.xml needs to be fixed to have correct mainClass: > org.apache.hadoop.mapred.gridmix.Gridmix -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2980) Fetch failures and other related issues in Jetty 6.1.26
[ https://issues.apache.org/jira/browse/MAPREDUCE-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290978#comment-13290978 ] Kang Xiao commented on MAPREDUCE-2980: -- Hi Todd, has jetty 6.1.27 been released yet? Or which version are you using at present? We downgraded to jetty 6.1.14, but it seems to cause a tasktracker memory problem: org.mortbay.jetty.nio.SelectChannelConnector$ConnectorEndPoint uses too much memory in the tasktracker. > Fetch failures and other related issues in Jetty 6.1.26 > --- > > Key: MAPREDUCE-2980 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-2980 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: tasktracker >Affects Versions: 0.20.205.0, 0.23.0 >Reporter: Todd Lipcon >Priority: Critical > > Since upgrading Jetty from 6.1.14 to 6.1.26 we've had a ton of HTTP-related > issues, including: > - Much higher incidence of fetch failures > - A few strange file-descriptor related bugs (eg MAPREDUCE-2389) > - A few unexplained issues where long "fsck"s on the NameNode drop out > halfway through with a ClosedChannelException > Stress tests with 1Map x 1Reduce sleep jobs reliably reproduce fetch > failures at a rate of about 1 per million on a 25 node test cluster. These > problems are all new since the upgrade from 6.1.14.
[jira] [Commented] (MAPREDUCE-4260) Investigate use of JobObject to spawn tasks on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290915#comment-13290915 ] Ivan Mitic commented on MAPREDUCE-4260: --- This is a great change, thanks Bikas! I have a few minor questions/suggestions: 1. ProcessTree.java: {{ProcessTree.isSetsidSupported}} Do we have to check for the existence of the "Create a new task..." string before we can enable setsid? It is not intuitive that one has to change another place in the code if the winutils output changes. If you still need this check, can you please put a comment in winutils to call this out? 2. ProcessTree.java: It might be useful to log if setsid functionality is not available. 3. JobConf.java: Is there a scenario where one wouldn't want to use JobObjects on Windows? 4. task.c:197 Missing a check on whether {{LocalAlloc}} succeeded. 5. task.c:201 I didn't get the motivation for the {{ERROR_MORE_DATA}} check. Will {{procList->NumberOfAssignedProcesses}} be valid in case of this error? > Investigate use of JobObject to spawn tasks on Windows > -- > > Key: MAPREDUCE-4260 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4260 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Affects Versions: 1.0.0 >Reporter: Bikas Saha >Assignee: Bikas Saha > Attachments: MAPREDUCE-4260.branch-1-win.patch, MAPREDUCE-4260.patch, > test.cpp > > > Currently, the Windows version spawns the task as a normal cmd shell from > which other downstream exe's are spawned. However, this is not bullet-proof > because if an intermediate process exits before its child exits, then the > parent-child process tree relationship cannot be constructed. Windows has a > concept of JobObject that is similar to the setsid behavior used in Linux. > The initial spawned task could be launched within its JobObject. Thereafter, > process termination, memory management etc could be operated on the JobObject.
[jira] [Commented] (MAPREDUCE-4290) JobStatus.getState() API is giving ambiguous values
[ https://issues.apache.org/jira/browse/MAPREDUCE-4290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290898#comment-13290898 ] Hadoop QA commented on MAPREDUCE-4290: -- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12531241/MAPREDUCE-4290.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2443//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2443//console This message is automatically generated. > JobStatus.getState() API is giving ambiguous values > --- > > Key: MAPREDUCE-4290 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4290 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 2.0.0-alpha, 3.0.0 >Reporter: Nishan Shetty >Assignee: Devaraj K > Attachments: MAPREDUCE-4290.patch > > > For a failed job, the getState() API gives the status as SUCCEEDED if we use > JobClient.getAllJobs() for retrieving all jobs info from RM.
[jira] [Updated] (MAPREDUCE-4290) JobStatus.getState() API is giving ambiguous values
[ https://issues.apache.org/jira/browse/MAPREDUCE-4290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated MAPREDUCE-4290: - Affects Version/s: 3.0.0 Status: Patch Available (was: Open) The JobClient.getAllJobs() API is giving the status as SUCCEEDED even if the job has failed. While converting from an application report to a JobStatus, it considers only the YARN application state. If the application state is FINISHED and the final status is FAILED, it still reports the job status as SUCCEEDED because it looks only at the application state. I have attached a patch to address this: the job status is reported as SUCCEEDED only if the YARN application state is FINISHED and the final status is also SUCCEEDED. > JobStatus.getState() API is giving ambiguous values > --- > > Key: MAPREDUCE-4290 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4290 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 2.0.0-alpha, 3.0.0 >Reporter: Nishan Shetty >Assignee: Devaraj K > Attachments: MAPREDUCE-4290.patch > > > For a failed job, the getState() API gives the status as SUCCEEDED if we use > JobClient.getAllJobs() for retrieving all jobs info from RM.
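The mapping Devaraj describes can be sketched as a small state-translation function. The enums and method below are simplified stand-ins for the real YarnApplicationState, FinalApplicationStatus, and JobStatus.State types, not the actual patch:

```java
// Simplified stand-ins for the YARN/MapReduce types; names are illustrative.
enum AppState { NEW, RUNNING, FINISHED, FAILED, KILLED }
enum FinalStatus { UNDEFINED, SUCCEEDED, FAILED, KILLED }
enum JobState { RUNNING, SUCCEEDED, FAILED, KILLED }

class JobStateMapper {
    static JobState map(AppState state, FinalStatus finalStatus) {
        switch (state) {
            case FAILED:
                return JobState.FAILED;
            case KILLED:
                return JobState.KILLED;
            case FINISHED:
                // The fix: a FINISHED application is SUCCEEDED only if the
                // final status agrees; otherwise report the real outcome.
                switch (finalStatus) {
                    case SUCCEEDED: return JobState.SUCCEEDED;
                    case KILLED:    return JobState.KILLED;
                    default:        return JobState.FAILED;
                }
            default:
                return JobState.RUNNING;
        }
    }
}
```

The buggy behavior corresponds to returning SUCCEEDED for any FINISHED application, ignoring the final status entirely.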
[jira] [Updated] (MAPREDUCE-4290) JobStatus.getState() API is giving ambiguous values
[ https://issues.apache.org/jira/browse/MAPREDUCE-4290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated MAPREDUCE-4290: - Attachment: MAPREDUCE-4290.patch > JobStatus.getState() API is giving ambiguous values > --- > > Key: MAPREDUCE-4290 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4290 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 2.0.0-alpha >Reporter: Nishan Shetty >Assignee: Devaraj K > Attachments: MAPREDUCE-4290.patch > > > For a failed job, the getState() API gives the status as SUCCEEDED if we use > JobClient.getAllJobs() for retrieving all jobs info from RM.