[jira] [Updated] (MAPREDUCE-5867) Possible NPE in KillAMPreemptionPolicy related to ProportionalCapacityPreemptionPolicy
[ https://issues.apache.org/jira/browse/MAPREDUCE-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated MAPREDUCE-5867: --- Issue Type: Sub-task (was: Bug) Parent: MAPREDUCE-4584 Possible NPE in KillAMPreemptionPolicy related to ProportionalCapacityPreemptionPolicy -- Key: MAPREDUCE-5867 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5867 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.3.0 Reporter: Sunil G Assignee: Sunil G Attachments: MapReduce-5867-updated.patch, MapReduce-5867.2.patch, MapReduce-5867.3.patch, Yarn-1980.1.patch I configured KillAMPreemptionPolicy for my Application Master and tried to check preemption of queues. In one scenario I have seen the below NPE in my AM:
{noformat}
2014-04-24 15:11:08,860 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN CONTACTING RM.
java.lang.NullPointerException
at org.apache.hadoop.mapreduce.v2.app.rm.preemption.KillAMPreemptionPolicy.preempt(KillAMPreemptionPolicy.java:57)
at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:662)
at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:246)
at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:267)
at java.lang.Thread.run(Thread.java:662)
{noformat}
I was using 2.2.0 and merged MAPREDUCE-5189 to see how AM preemption works. -- This message was sent by Atlassian JIRA (v6.2#6252)
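The trace above points at a dereference of a preemption-message field that the RM may not have populated. The following is a hypothetical plain-Java sketch of the null-tolerant handling a fix would need; the types here are stand-ins for illustration, not the real YARN API.

```java
import java.util.Collections;
import java.util.Set;

// Stand-in types for illustration only; the real YARN preemption
// classes (PreemptionMessage, PreemptionContract, ...) differ.
public class PreemptionGuardDemo {
    static class StrictContract {
        Set<String> containers = Collections.emptySet();
    }
    static class PreemptionMessage {
        StrictContract strict; // may be null, depending on the RM policy
    }

    // Null-safe handling: only act on the parts the RM actually sent.
    static Set<String> containersToKill(PreemptionMessage msg) {
        if (msg == null || msg.strict == null) {
            return Collections.emptySet(); // nothing strictly preempted
        }
        return msg.strict.containers;
    }

    public static void main(String[] args) {
        // A message with no strict contract must not cause an NPE.
        System.out.println(containersToKill(new PreemptionMessage()).size()); // prints 0
    }
}
```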
[jira] [Updated] (MAPREDUCE-5889) Deprecate FileInputFormat.setInputPaths(Job, String) and FileInputFormat.addInputPaths(Job, String)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated MAPREDUCE-5889: - Target Version/s: 2.5.0 Release Note: Deprecate o.a.h.mapreduce.lib.input.setInputPaths(Job, String) and o.a.h.mapreduce.lib.input.addInputPaths(Job, String). Use setInputPaths(Job, Path...) and addInputPaths(Job, Path...) instead. Status: Patch Available (was: Open) Deprecate FileInputFormat.setInputPaths(Job, String) and FileInputFormat.addInputPaths(Job, String) --- Key: MAPREDUCE-5889 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5889 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Akira AJISAKA Assignee: Akira AJISAKA Priority: Minor Labels: newbie Attachments: MAPREDUCE-5889.patch {{FileInputFormat.setInputPaths(Job job, String commaSeparatedPaths)}} and {{FileInputFormat.addInputPaths(Job job, String commaSeparatedPaths)}} fail to parse commaSeparatedPaths if a comma is included in the file path. (e.g. Path: {{/path/file,with,comma}}) We should deprecate these methods and document to use {{setInputPaths(Job job, Path... inputPaths)}} and {{addInputPaths(Job job, Path... inputPaths)}} instead. -- This message was sent by Atlassian JIRA (v6.2#6252)
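The parsing pitfall described above can be reproduced without Hadoop at all. A minimal plain-Java sketch follows; the helper mimics the String-based API's comma splitting and is not the actual FileInputFormat code.

```java
// Sketch (plain Java, no Hadoop dependency): why a comma-separated
// path API cannot round-trip a path that itself contains commas.
public class CommaPathDemo {
    // Mimics the String-based setInputPaths/addInputPaths contract:
    // the whole argument is split on commas.
    static String[] parseCommaSeparated(String commaSeparatedPaths) {
        return commaSeparatedPaths.split(",");
    }

    public static void main(String[] args) {
        // One real file whose name contains commas...
        String path = "/path/file,with,comma";
        String[] parsed = parseCommaSeparated(path);
        // ...is misread as three separate input paths.
        System.out.println(parsed.length); // prints 3, not 1
        // A varargs Path... API avoids the ambiguity entirely,
        // since each path arrives as its own argument.
    }
}
```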
[jira] [Updated] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts
[ https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated MAPREDUCE-5652: Resolution: Fixed Fix Version/s: 2.5.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks for the contribution and patience with multiple reviews, Jason. Just committed this to trunk and branch-2. NM Recovery. ShuffleHandler should handle NM restarts - Key: MAPREDUCE-5652 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5652 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Jason Lowe Labels: shuffle Fix For: 2.5.0 Attachments: MAPREDUCE-5652-v10.patch, MAPREDUCE-5652-v2.patch, MAPREDUCE-5652-v3.patch, MAPREDUCE-5652-v4.patch, MAPREDUCE-5652-v5.patch, MAPREDUCE-5652-v6.patch, MAPREDUCE-5652-v7.patch, MAPREDUCE-5652-v8.patch, MAPREDUCE-5652-v9-and-YARN-1987.patch, MAPREDUCE-5652.patch ShuffleHandler should work across NM restarts and not require re-running map-tasks. On NM restart, the map outputs are cleaned up requiring re-execution of map tasks and should be avoided. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-4071) NPE while executing MRAppMaster shutdown hook
[ https://issues.apache.org/jira/browse/MAPREDUCE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996832#comment-13996832 ] Chen He commented on MAPREDUCE-4071: ping NPE while executing MRAppMaster shutdown hook - Key: MAPREDUCE-4071 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4071 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am, mrv2 Affects Versions: 0.23.3, 2.0.0-alpha, trunk Reporter: Bhallamudi Venkata Siva Kamesh Attachments: MAPREDUCE-4071-1.patch, MAPREDUCE-4071-2.patch, MAPREDUCE-4071-2.patch, MAPREDUCE-4071.patch While running the shutdown hook of MRAppMaster, an NPE was hit:
{noformat}
Exception in thread "Thread-1" java.lang.NullPointerException
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.setSignalled(MRAppMaster.java:668)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$MRAppMasterShutdownHook.run(MRAppMaster.java:1004)
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
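An NPE inside a shutdown hook usually means the hook dereferenced a service that was never initialized before the JVM began tearing down. A hypothetical plain-Java sketch of the defensive pattern; the names are illustrative and not the actual MRAppMaster fields.

```java
// Hypothetical sketch of a null-safe shutdown hook; field and class
// names are illustrative, not the real MRAppMaster code.
public class NullSafeShutdownHook {
    // May still be null if the JVM shuts down before init completes.
    static volatile Runnable containerAllocator;

    // Guard used by the hook: tolerate a service that was never started.
    static void runIfInitialized(Runnable service) {
        if (service != null) {
            service.run();
        }
    }

    public static void main(String[] args) {
        Runtime.getRuntime().addShutdownHook(
            new Thread(() -> runIfInitialized(containerAllocator)));
        // If shutdown happens before containerAllocator is assigned,
        // the hook above simply does nothing instead of throwing an NPE.
        System.out.println("hook registered");
    }
}
```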
[jira] [Assigned] (MAPREDUCE-5889) Deprecate FileInputFormat.setInputPaths(JobConf, String) and FileInputFormat.addInputPaths(JobConf, String)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA reassigned MAPREDUCE-5889: Assignee: Akira AJISAKA Deprecate FileInputFormat.setInputPaths(JobConf, String) and FileInputFormat.addInputPaths(JobConf, String) --- Key: MAPREDUCE-5889 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5889 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Akira AJISAKA Assignee: Akira AJISAKA Priority: Minor Labels: newbie {{FileInputFormat.setInputPaths(JobConf conf, String commaSeparatedPaths)}} and {{FileInputFormat.addInputPaths(JobConf conf, String commaSeparatedPaths)}} fail to parse commaSeparatedPaths if a comma is included in the file path. (e.g. Path: {{/path/file,with,comma}}) We should deprecate these methods and document to use {{setInputPaths(JobConf conf, Path... inputPaths)}} and {{addInputPaths(JobConf conf, Path... inputPaths)}} instead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5887) Move split creation from submission client to MRAppMaster
[ https://issues.apache.org/jira/browse/MAPREDUCE-5887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13997431#comment-13997431 ] Steve Loughran commented on MAPREDUCE-5887: --- This is good, especially when the client is something like a laptop trying to submit to in-cloud deployments. One test to try there is what happens when the blocksize is reported as very, very small (you can configure this in swiftfs). In the client this will cause the submitting process to OOM and fail. Presumably the same outcome in the AM is the simplest to implement; we just need to make sure that YARN recognises this as a failure and only tries a couple of times. Move split creation from submission client to MRAppMaster - Key: MAPREDUCE-5887 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5887 Project: Hadoop Map/Reduce Issue Type: Improvement Components: applicationmaster, client Affects Versions: 2.4.0 Reporter: Gera Shegalov Assignee: Gera Shegalov Attachments: MAPREDUCE-5887.v01.patch This JIRA is filed to improve scalability of job submission, specifically when there is a significant latency between the submission client and the cluster nodes (RM and NN), e.g. in a multi-datacenter environment. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5888) Failed job leaves hung AM after it unregisters
[ https://issues.apache.org/jira/browse/MAPREDUCE-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996732#comment-13996732 ] Hadoop QA commented on MAPREDUCE-5888: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12644640/MAPREDUCE-5888.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4599//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4599//console This message is automatically generated. Failed job leaves hung AM after it unregisters --- Key: MAPREDUCE-5888 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5888 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am Affects Versions: 2.2.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: MAPREDUCE-5888.patch When a job fails the AM hangs during shutdown. 
A non-daemon thread pool executor thread prevents the JVM teardown from completing, and the AM lingers on the cluster for the AM expiry interval in the FINISHING state until eventually the RM expires it and kills the container. If application limits on the queue are relatively low (e.g.: small queue or small cluster) this can cause unnecessary delays in resource scheduling on the cluster. -- This message was sent by Atlassian JIRA (v6.2#6252)
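The hang described above is the classic non-daemon executor pitfall: the JVM will not exit while a non-daemon pool thread is alive. A plain java.util.concurrent sketch of the two usual remedies (this is illustrative, not the actual MRAppMaster fix): mark pool threads as daemon, and/or shut the executor down explicitly before exit.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;

public class DaemonExecutorDemo {
    // Remedy 1: daemon threads never block JVM teardown.
    static ExecutorService newDaemonPool(int threads) {
        ThreadFactory daemonFactory = runnable -> {
            Thread t = new Thread(runnable);
            t.setDaemon(true); // JVM may exit even if this thread is alive
            return t;
        };
        return Executors.newFixedThreadPool(threads, daemonFactory);
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = newDaemonPool(2);
        pool.submit(() -> System.out.println("task ran"));
        // Remedy 2: shut down explicitly so no worker lingers either way.
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```

Either remedy alone would prevent the FINISHING-state lingering described in the report; doing both is the belt-and-braces option.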
[jira] [Updated] (MAPREDUCE-5889) Deprecate FileInputFormat.setInputPaths(Job, String) and FileInputFormat.addInputPaths(Job, String)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated MAPREDUCE-5889: - Description: {{FileInputFormat.setInputPaths(Job job, String commaSeparatedPaths)}} and {{FileInputFormat.addInputPaths(Job job, String commaSeparatedPaths)}} fail to parse commaSeparatedPaths if a comma is included in the file path. (e.g. Path: {{/path/file,with,comma}}) We should deprecate these methods and document to use {{setInputPaths(Job job, Path... inputPaths)}} and {{addInputPaths(Job job, Path... inputPaths)}} instead. was: {{FileInputFormat.setInputPaths(JobConf conf, String commaSeparatedPaths)}} and {{FileInputFormat.addInputPaths(JobConf conf, String commaSeparatedPaths)}} fail to parse commaSeparatedPaths if a comma is included in the file path. (e.g. Path: {{/path/file,with,comma}}) We should deprecate these methods and document to use {{setInputPaths(JobConf conf, Path... inputPaths)}} and {{addInputPaths(JobConf conf, Path... inputPaths)}} instead. Summary: Deprecate FileInputFormat.setInputPaths(Job, String) and FileInputFormat.addInputPaths(Job, String) (was: Deprecate FileInputFormat.setInputPaths(JobConf, String) and FileInputFormat.addInputPaths(JobConf, String)) Deprecate FileInputFormat.setInputPaths(Job, String) and FileInputFormat.addInputPaths(Job, String) --- Key: MAPREDUCE-5889 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5889 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Akira AJISAKA Assignee: Akira AJISAKA Priority: Minor Labels: newbie {{FileInputFormat.setInputPaths(Job job, String commaSeparatedPaths)}} and {{FileInputFormat.addInputPaths(Job job, String commaSeparatedPaths)}} fail to parse commaSeparatedPaths if a comma is included in the file path. (e.g. Path: {{/path/file,with,comma}}) We should deprecate these methods and document to use {{setInputPaths(Job job, Path... inputPaths)}} and {{addInputPaths(Job job, Path... inputPaths)}} instead. 
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5886) Allow wordcount example job to accept multiple input paths.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13997088#comment-13997088 ] Gera Shegalov commented on MAPREDUCE-5886: -- Chris, thanks for the JDK pointer, I am aware of the behavior. Allow wordcount example job to accept multiple input paths. --- Key: MAPREDUCE-5886 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5886 Project: Hadoop Map/Reduce Issue Type: Improvement Components: examples Affects Versions: 3.0.0, 2.4.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Priority: Minor Attachments: MAPREDUCE-5886.1.patch, MAPREDUCE-5886.2.patch, MAPREDUCE-5886.3.patch It would be convenient if the wordcount example MapReduce job could accept multiple input paths and run the word count on all of them. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5889) Deprecate FileInputFormat.setInputPaths(Job, String) and FileInputFormat.addInputPaths(Job, String)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated MAPREDUCE-5889: - Attachment: MAPREDUCE-5889.patch Attaching a patch to deprecate these methods and to fix javac warnings. Deprecate FileInputFormat.setInputPaths(Job, String) and FileInputFormat.addInputPaths(Job, String) --- Key: MAPREDUCE-5889 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5889 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Akira AJISAKA Assignee: Akira AJISAKA Priority: Minor Labels: newbie Attachments: MAPREDUCE-5889.patch {{FileInputFormat.setInputPaths(Job job, String commaSeparatedPaths)}} and {{FileInputFormat.addInputPaths(Job job, String commaSeparatedPaths)}} fail to parse commaSeparatedPaths if a comma is included in the file path. (e.g. Path: {{/path/file,with,comma}}) We should deprecate these methods and document to use {{setInputPaths(Job job, Path... inputPaths)}} and {{addInputPaths(Job job, Path... inputPaths)}} instead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5309) 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server
[ https://issues.apache.org/jira/browse/MAPREDUCE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated MAPREDUCE-5309: -- Description: When the 2.0.4 JobHistoryParser tries to parse a job history file generated by hadoop 2.0.3, the JobHistoryParser throws an error:
{noformat}
java.lang.ClassCastException: org.apache.avro.generic.GenericData$Array cannot be cast to org.apache.hadoop.mapreduce.jobhistory.JhCounters
at org.apache.hadoop.mapreduce.jobhistory.TaskAttemptUnsuccessfulCompletion.put(TaskAttemptUnsuccessfulCompletion.java:58)
at org.apache.avro.generic.GenericData.setField(GenericData.java:463)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
at org.apache.hadoop.mapreduce.jobhistory.EventReader.getNextEvent(EventReader.java:93)
at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:111)
at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:156)
at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:142)
at com.twitter.somepackage.Test20JobHistoryParsing.testFileAvro(Test20JobHistoryParsing.java:23)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
{noformat}
Test code and the job history file are attached.
Test code:
{code}
package com.twitter.somepackagel;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser;
import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.JobInfo;
import org.junit.Test;
import org.apache.hadoop.yarn.YarnException;

public class Test20JobHistoryParsing {
  @Test
  public void testFileAvro() throws IOException {
    Path local_path2 = new Path("/tmp/job_2_0_3-KILLED.jhist");
    JobHistoryParser parser2 = new JobHistoryParser(
        FileSystem.getLocal(new Configuration()), local_path2);
    try {
      JobInfo ji2 = parser2.parse();
      System.out.println("job info: " + ji2.getJobname() + " "
          + ji2.getFinishedMaps() + " " + ji2.getTotalMaps() + " "
          + ji2.getJobId());
    } catch (IOException e) {
      throw new YarnException("Could not load history file "
          + local_path2.getName(), e);
    }
  }
}
{code}
This seems to stem from the fix in https://issues.apache.org/jira/browse/MAPREDUCE-4693 that added counters to the historyserver for failed tasks. This breaks backward compatibility with JobHistoryServer. was: When the 2.0.4 JobHistoryParser tries to parse a job history file generated by hadoop 2.0.3, the
[jira] [Updated] (MAPREDUCE-5814) fat jar with *-default.xml may fail when mapreduce.job.classloader=true.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-5814: -- Resolution: Fixed Fix Version/s: 2.5.0 3.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks, Gera! I committed this to trunk and branch-2. fat jar with *-default.xml may fail when mapreduce.job.classloader=true. Key: MAPREDUCE-5814 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5814 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.3.0 Reporter: Gera Shegalov Assignee: Gera Shegalov Fix For: 3.0.0, 2.5.0 Attachments: MAPREDUCE-5814.v01.patch, MAPREDUCE-5814.v02.patch, MAPREDUCE-5814.v03.patch We faced a failure when a job.jar compiled against 0.20+ hadoop artifacts had to run with {{mapreduce.job.classloader=true}} because it needed a more recent guava as a dependency. The job failed because the cluster's {{*-default.xml}} files were overshadowed by the ones in the fat jar. We propose to treat these default config files like the system packages {{org.apache.hadoop.}} to avoid a counterintuitive behavior as if we had {{mapreduce.job.user.classpath.first}} set. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5888) Failed job leaves hung AM after it unregisters
[ https://issues.apache.org/jira/browse/MAPREDUCE-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-5888: -- Attachment: MAPREDUCE-5888.patch Quick patch to fix the issue. Manually tested it with a failed job and saw that the MRAppMaster hung after unregistering without the change and does not hang with the patch. Failed job leaves hung AM after it unregisters --- Key: MAPREDUCE-5888 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5888 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am Affects Versions: 2.2.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: MAPREDUCE-5888.patch When a job fails the AM hangs during shutdown. A non-daemon thread pool executor thread prevents the JVM teardown from completing, and the AM lingers on the cluster for the AM expiry interval in the FINISHING state until eventually the RM expires it and kills the container. If application limits on the queue are relatively low (e.g.: small queue or small cluster) this can cause unnecessary delays in resource scheduling on the cluster. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5888) Failed job leaves hung AM after it unregisters
[ https://issues.apache.org/jira/browse/MAPREDUCE-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996622#comment-13996622 ] Jonathan Eagles commented on MAPREDUCE-5888: pending Hadoop QA +1 Failed job leaves hung AM after it unregisters --- Key: MAPREDUCE-5888 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5888 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am Affects Versions: 2.2.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: MAPREDUCE-5888.patch When a job fails the AM hangs during shutdown. A non-daemon thread pool executor thread prevents the JVM teardown from completing, and the AM lingers on the cluster for the AM expiry interval in the FINISHING state until eventually the RM expires it and kills the container. If application limits on the queue are relatively low (e.g.: small queue or small cluster) this can cause unnecessary delays in resource scheduling on the cluster. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5637) Convert Hadoop Streaming document to APT
[ https://issues.apache.org/jira/browse/MAPREDUCE-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13992835#comment-13992835 ] Hudson commented on MAPREDUCE-5637: --- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1777 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1777/]) MAPREDUCE-5637. Convert Hadoop Streaming document to APT (Akira AJISAKA via jeagles) (jeagles: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1592789) * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/HadoopStreaming.apt.vm * /hadoop/common/trunk/hadoop-project/src/site/site.xml Convert Hadoop Streaming document to APT Key: MAPREDUCE-5637 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5637 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: documentation Affects Versions: 2.2.0 Reporter: Akira AJISAKA Assignee: Akira AJISAKA Fix For: 3.0.0, 2.5.0 Attachments: MAPREDUCE-5637.2.patch, MAPREDUCE-5637.patch Convert Hadoop Streaming document from forrest to APT. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5867) Possible NPE in KillAMPreemptionPolicy related to ProportionalCapacityPreemptionPolicy
[ https://issues.apache.org/jira/browse/MAPREDUCE-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated MAPREDUCE-5867: - Status: Open (was: Patch Available) Thanks Sunil for the updated patch with the test case. Please find some comments for the added test:
* make all the instance/static variables private in the test class
* can you cover the test with these cases:
** with strictContract as null
** with strictContract as not null
** with contract as null
** with contract as not null
* it may not require that many containers for preempting
* remove the SOP (System.out.println) in setup()
Possible NPE in KillAMPreemptionPolicy related to ProportionalCapacityPreemptionPolicy -- Key: MAPREDUCE-5867 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5867 Project: Hadoop Map/Reduce Issue Type: Bug Components: resourcemanager Affects Versions: 2.3.0 Reporter: Sunil G Assignee: Sunil G Attachments: MapReduce-5867.2.patch, MapReduce-5867.3.patch, Yarn-1980.1.patch I configured KillAMPreemptionPolicy for my Application Master and tried to check preemption of queues. In one scenario I have seen the below NPE in my AM:
{noformat}
2014-04-24 15:11:08,860 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN CONTACTING RM.
java.lang.NullPointerException
at org.apache.hadoop.mapreduce.v2.app.rm.preemption.KillAMPreemptionPolicy.preempt(KillAMPreemptionPolicy.java:57)
at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:662)
at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:246)
at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:267)
at java.lang.Thread.run(Thread.java:662)
{noformat}
I was using 2.2.0 and merged MAPREDUCE-5189 to see how AM preemption works. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5885) build/test/test.mapred.spill causes release audit warnings
[ https://issues.apache.org/jira/browse/MAPREDUCE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated MAPREDUCE-5885: --- Attachment: MAPREDUCE-5885.patch patch submitted. build/test/test.mapred.spill causes release audit warnings -- Key: MAPREDUCE-5885 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5885 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: trunk Reporter: Jason Lowe Assignee: Chen He Attachments: MAPREDUCE-5885.patch Multiple unit tests are creating files under hadoop-mapreduce-client-jobclient/build/test/test.mapred.spill which are causing release audit warnings during Jenkins patch precommit builds. In addition to being in a poor location for test output and not cleaning up after the test, there are multiple tests using this location which will cause conflicts if tests are run in parallel. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5465) Container killed before hprof dumps profile.out
[ https://issues.apache.org/jira/browse/MAPREDUCE-5465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13993652#comment-13993652 ] Jason Lowe commented on MAPREDUCE-5465: --- The release audit warnings are unrelated; filed MAPREDUCE-5885. The TestPipeApplication timeout is also unrelated, see MAPREDUCE-5868. Thanks for updating the patch, Ming! Sorry for the long delay in getting back to this. I've been thinking about the performance implications of this change. I'm wondering if we should treat the finishing states as if they're the corresponding completed states from external entities (i.e.: task/job). We would send T_ATTEMPT_SUCCEEDED or T_ATTEMPT_FAILED and set task finish times to the time the attempt said it succeeded or failed rather than the time the container completed. Similarly we would map the internal finishing states to their respective external SUCCEEDED/FAILED state rather than RUNNING. From the task/job perspective they're not particularly interested in when the attempt exits; rather they only care about when the task says its output is available. This would allow the task and job to react to success/failure transitions in the same timeframe that it does today, so there should be a minimal performance impact. The only impact would be if the container needs to complete to free up enough space for the next task's container to be allocated, and in most cases the task will complete quickly enough that the AM will receive the new container in the same heartbeat that it used to before this change. Actually this may end up being slightly faster than what it does today, since today it connects to the NM and sends the kill command before it considers the task completed. This proposal would have the task complete as soon as the task indicated via the umbilical.
Other comments on the latest patch:
- Rather than have the finishing states call the cleanup container transition and have that transition special-case being called by finishing states, it'd be cleaner to factor out the common code from the cleanup container transition that they're trying to leverage and call that instead. Transitions doing state or event checks usually mean something's a bit off, since the transition should already know what event triggered it and what state(s) it applies to.
- Similarly, the timeout transitions should have dedicated transition code that not only warns in the AM log but also sets an attempt diagnostic message. It can re-use some/all of the cleanup container transition so it's not replicating code. With the diagnostic it will be much more likely the user will be aware of the timeout issue and fix their task code. Tasks that time out during finishing can still succeed, so users probably won't even know something went wrong unless they bother to examine the AM log and happen to notice it.
- This change looks like some accidental reformatting:
{noformat}
--- a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/LocalContainerLauncher.java
+++ b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/LocalContainerLauncher.java
@@ -222,7 +222,7 @@ public void run() {
// remember the current attempt
futures.put(event.getTaskAttemptID(), future);
-} else if (event.getType() == EventType.CONTAINER_REMOTE_CLEANUP) {
+ } else if (event.getType() == EventType.CONTAINER_REMOTE_CLEANUP) {
// cancel (and interrupt) the current running task associated with the
// event
{noformat}
- Nit: a sendContainerCompleted utility method to send the CONTAINER_COMPLETED event would be nice
- Nit: code should be formatted to 80 columns, comments for the state transitions in particular.
Container killed before hprof dumps profile.out --- Key: MAPREDUCE-5465 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5465 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mr-am, mrv2 Affects Versions: trunk, 2.0.3-alpha Reporter: Radim Kolar Assignee: Ming Ma Attachments: MAPREDUCE-5465-2.patch, MAPREDUCE-5465-3.patch, MAPREDUCE-5465-4.patch, MAPREDUCE-5465-5.patch, MAPREDUCE-5465-6.patch, MAPREDUCE-5465.patch If profiling is enabled for the mapper or reducer, hprof dumps profile.out at process exit. It is dumped after the task has signaled to the AM that its work is finished. The AM kills the container with finished work without waiting for hprof to finish its dumps. If hprof is dumping larger outputs (such as with depth=4, while depth=3 works), it cannot finish the dump in time before being killed, making the entire dump unusable because cpu and heap stats are
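For reference, task profiling of the kind described in this report is normally switched on through job configuration. A sketch of the relevant Hadoop 2.x properties follows; the range and depth values are illustrative, not recommendations from this issue.

```
# Enable hprof profiling for selected task attempts (values illustrative)
mapreduce.task.profile=true
# Profile only the first map and first reduce attempt
mapreduce.task.profile.maps=0-0
mapreduce.task.profile.reduces=0-0
# hprof agent options; %s is replaced with the profile output path
mapreduce.task.profile.params=-agentlib:hprof=cpu=samples,heap=sites,depth=3,force=n,thread=y,verbose=n,file=%s
```

A deeper stack depth (e.g. depth=4, as in the report) enlarges the dump and widens the window in which the container can be killed mid-dump.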
[jira] [Commented] (MAPREDUCE-5888) Failed job leaves hung AM after it unregisters
[ https://issues.apache.org/jira/browse/MAPREDUCE-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996590#comment-13996590 ] Tsuyoshi OZAWA commented on MAPREDUCE-5888: --- +1 (non-binding). Failed job leaves hung AM after it unregisters --- Key: MAPREDUCE-5888 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5888 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am Affects Versions: 2.2.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: MAPREDUCE-5888.patch When a job fails the AM hangs during shutdown. A non-daemon thread pool executor thread prevents the JVM teardown from completing, and the AM lingers on the cluster for the AM expiry interval in the FINISHING state until eventually the RM expires it and kills the container. If application limits on the queue are relatively low (e.g.: small queue or small cluster) this can cause unnecessary delays in resource scheduling on the cluster. -- This message was sent by Atlassian JIRA (v6.2#6252)
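The hang described here is the classic non-daemon pool-thread problem: a JVM does not exit while any non-daemon thread is alive. A self-contained sketch (plain Java, not the actual MR AM code) of the two standard fixes, daemon worker threads and an explicit shutdown:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class DaemonPoolSketch {
    // A pool whose threads are daemons cannot keep the JVM alive after
    // main() returns -- the failure mode in this issue is a non-daemon
    // executor thread pinning the AM JVM in the FINISHING state.
    static ExecutorService daemonPool() {
        return Executors.newFixedThreadPool(1, r -> {
            Thread t = new Thread(r, "daemon-worker");
            t.setDaemon(true);
            return t;
        });
    }

    static String runTask() throws Exception {
        ExecutorService pool = daemonPool();
        try {
            return pool.submit(() -> "done").get();
        } finally {
            pool.shutdown(); // the other fix: always shut the pool down
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runTask());
    }
}
```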
[jira] [Updated] (MAPREDUCE-5018) Support raw binary data with Hadoop streaming
[ https://issues.apache.org/jira/browse/MAPREDUCE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Willis updated MAPREDUCE-5018: - Attachment: MAPREDUCE-5018.patch New patch with tests Support raw binary data with Hadoop streaming - Key: MAPREDUCE-5018 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5018 Project: Hadoop Map/Reduce Issue Type: New Feature Components: contrib/streaming Reporter: Jay Hacker Priority: Minor Attachments: MAPREDUCE-5018-branch-1.1.patch, MAPREDUCE-5018.patch, MAPREDUCE-5018.patch, justbytes.jar, mapstream People often have a need to run older programs over many files, and turn to Hadoop streaming as a reliable, performant batch system. There are good reasons for this: 1. Hadoop is convenient: they may already be using it for mapreduce jobs, and it is easy to spin up a cluster in the cloud. 2. It is reliable: HDFS replicates data and the scheduler retries failed jobs. 3. It is reasonably performant: it moves the code to the data, maintaining locality, and scales with the number of nodes. Historically Hadoop is of course oriented toward processing key/value pairs, and so needs to interpret the data passing through it. Unfortunately, this makes it difficult to use Hadoop streaming with programs that don't deal in key/value pairs, or with binary data in general. For example, something as simple as running md5sum to verify the integrity of files will not give the correct result, due to Hadoop's interpretation of the data. There have been several attempts at binary serialization schemes for Hadoop streaming, such as TypedBytes (HADOOP-1722); however, these are still aimed at efficiently encoding key/value pairs, and not passing data through unmodified. Even the RawBytes serialization scheme adds length fields to the data, rendering it not-so-raw. 
I often have a need to run a Unix filter on files stored in HDFS; currently, the only way I can do this on the raw data is to copy the data out and run the filter on one machine, which is inconvenient, slow, and unreliable. It would be very convenient to run the filter as a map-only job, allowing me to build on existing (well-tested!) building blocks in the Unix tradition instead of reimplementing them as mapreduce programs. However, most existing tools don't know about file splits, and so want to process whole files; and of course many expect raw binary input and output. The solution is to run a map-only job with an InputFormat and OutputFormat that just pass raw bytes and don't split. It turns out to be a little more complicated with streaming; I have attached a patch with the simplest solution I could come up with. I call the format JustBytes (as RawBytes was already taken), and it should be usable with most recent versions of Hadoop. -- This message was sent by Atlassian JIRA (v6.2#6252)
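To make the "pass raw bytes unmodified" requirement concrete, here is a self-contained illustration in plain Java (not the attached patch) of the invariant a JustBytes-style format must preserve: the checksum of the data is identical after a round trip, which line-oriented text handling cannot generally guarantee for binary input containing newlines or arbitrary byte values.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;

public class RawBytesDemo {
    // Hex-encoded MD5 of a byte array, as md5sum would compute it.
    static String md5(byte[] data) throws Exception {
        byte[] d = MessageDigest.getInstance("MD5").digest(data);
        StringBuilder sb = new StringBuilder();
        for (byte b : d) sb.append(String.format("%02x", b));
        return sb.toString();
    }

    // Round-trips data as raw bytes: no decoding, no newline handling,
    // no splitting -- what a "JustBytes" format must do.
    static byte[] rawCopy(byte[] input) throws Exception {
        Path tmp = Files.createTempFile("justbytes", ".bin");
        Files.write(tmp, input);
        byte[] out = Files.readAllBytes(tmp);
        Files.delete(tmp);
        return out;
    }

    public static void main(String[] args) throws Exception {
        // Binary data a text-oriented reader would mangle.
        byte[] binary = {0x00, (byte) 0xFF, '\r', '\n', 'x'};
        System.out.println(md5(binary).equals(md5(rawCopy(binary))));
    }
}
```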
[jira] [Commented] (MAPREDUCE-5874) Creating MapReduce REST API section
[ https://issues.apache.org/jira/browse/MAPREDUCE-5874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13995939#comment-13995939 ] Akira AJISAKA commented on MAPREDUCE-5874: -- +1 (non-binding) Creating MapReduce REST API section --- Key: MAPREDUCE-5874 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5874 Project: Hadoop Map/Reduce Issue Type: Bug Components: documentation Affects Versions: 2.4.0 Reporter: Ravi Prakash Assignee: Tsuyoshi OZAWA Attachments: MAPREDUCE-5874.2.patch, MAPREDUCE-5874.3.patch, YARN-1999.1.patch Now that we have the YARN HistoryServer, perhaps we should move HistoryServerRest.apt.vm and MapRedAppMasterRest.apt.vm into the MapReduce section where it really belongs? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5886) Allow wordcount example job to accept multiple input paths.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996830#comment-13996830 ] Chris Nauroth commented on MAPREDUCE-5886: -- Hi, [~jira.shegalov] and [~ajisakaa]. Thanks for looking at this and contributing some new ideas. Regarding {{FileInputFormat#addInputPaths}}, in addition to the issue raised by Akira for supporting comma in a file name, there is another reason why I didn't use that method. On Windows Command Prompt, the comma acts as an argument separator, much like space. This would have the potential to create confusion for users on Windows. The basic concept of the new API looks good to me. We might instead consider passing varargs and no range indices. Word count could chop the input args down to the correct range using {{Arrays#copyOfRange}} or {{List#subList}}. Would you mind moving all of the API work to another jira? MAPREDUCE-5889 probably would work for that. For this issue, I was hoping to put in a quick trivial patch in just word count to enable this. IOW, I'd like to pursue a binding +1 on patch v1 and commit it. Thanks again! Allow wordcount example job to accept multiple input paths. --- Key: MAPREDUCE-5886 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5886 Project: Hadoop Map/Reduce Issue Type: Improvement Components: examples Affects Versions: 3.0.0, 2.4.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Priority: Minor Attachments: MAPREDUCE-5886.1.patch, MAPREDUCE-5886.2.patch, MAPREDUCE-5886.3.patch It would be convenient if the wordcount example MapReduce job could accept multiple input paths and run the word count on all of them. -- This message was sent by Atlassian JIRA (v6.2#6252)
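The varargs idea above amounts to treating all but the last argument as input paths. A tiny self-contained illustration of the {{Arrays#copyOfRange}} / {{List#subList}} chopping mentioned in the comment (helper names are hypothetical, not the wordcount patch):

```java
import java.util.Arrays;
import java.util.List;

public class ArgSplitSketch {
    // All but the last command-line argument are input paths...
    static List<String> inputPaths(String[] args) {
        // equivalently: Arrays.asList(args).subList(0, args.length - 1)
        return Arrays.asList(Arrays.copyOfRange(args, 0, args.length - 1));
    }

    // ...and the last argument is the output path.
    static String outputPath(String[] args) {
        return args[args.length - 1];
    }

    public static void main(String[] args) {
        String[] demo = {"in1", "in2", "in3", "out"};
        System.out.println(inputPaths(demo)); // [in1, in2, in3]
        System.out.println(outputPath(demo)); // out
    }
}
```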
[jira] [Commented] (MAPREDUCE-5886) Allow wordcount example job to accept multiple input paths.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13995682#comment-13995682 ] Hadoop QA commented on MAPREDUCE-5886: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12644305/MAPREDUCE-5886.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-examples. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4597//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4597//console This message is automatically generated. Allow wordcount example job to accept multiple input paths. 
--- Key: MAPREDUCE-5886 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5886 Project: Hadoop Map/Reduce Issue Type: Improvement Components: examples Affects Versions: 3.0.0, 2.4.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Priority: Minor Attachments: MAPREDUCE-5886.1.patch, MAPREDUCE-5886.2.patch It would be convenient if the wordcount example MapReduce job could accept multiple input paths and run the word count on all of them. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5774) Job overview in History UI should list reducer phases in chronological order
[ https://issues.apache.org/jira/browse/MAPREDUCE-5774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998207#comment-13998207 ] Hudson commented on MAPREDUCE-5774: --- FAILURE: Integrated in Hadoop-Hdfs-trunk #1753 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1753/]) MAPREDUCE-5774. Job overview in History UI should list reducer phases in chronological order. (Gera Shegalov via kasha) (kasha: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1593890) * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/webapp/HsJobBlock.java Job overview in History UI should list reducer phases in chronological order Key: MAPREDUCE-5774 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5774 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobhistoryserver Affects Versions: 2.3.0 Reporter: Gera Shegalov Assignee: Gera Shegalov Priority: Trivial Fix For: 2.5.0 Attachments: MAPREDUCE-5774.v01.patch
Current order:
Average Map Time 9sec
Average Reduce Time 0sec
Average Shuffle Time 22sec
Average Merge Time 0sec
Proposed order:
Average Map Time 9sec
Average Shuffle Time 22sec
Average Merge Time 0sec
Average Reduce Time 0sec
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5652) NM Recovery. ShuffleHandler should handle NM restarts
[ https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998208#comment-13998208 ] Hudson commented on MAPREDUCE-5652: --- FAILURE: Integrated in Hadoop-Hdfs-trunk #1753 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1753/]) MAPREDUCE-5652. NM Recovery. ShuffleHandler should handle NM restarts. (Jason Lowe via kasha) (kasha: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1594329) * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/LICENSE.txt * /hadoop/common/trunk/hadoop-mapreduce-project/dev-support/findbugs-exclude.xml * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobID.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/pom.xml * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/proto * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/proto/ShuffleHandlerRecovery.proto * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/test/java/org/apache/hadoop/mapred/TestShuffleHandler.java * /hadoop/common/trunk/hadoop-mapreduce-project/pom.xml NM Recovery. 
ShuffleHandler should handle NM restarts - Key: MAPREDUCE-5652 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5652 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Jason Lowe Labels: shuffle Fix For: 2.5.0 Attachments: MAPREDUCE-5652-v10.patch, MAPREDUCE-5652-v2.patch, MAPREDUCE-5652-v3.patch, MAPREDUCE-5652-v4.patch, MAPREDUCE-5652-v5.patch, MAPREDUCE-5652-v6.patch, MAPREDUCE-5652-v7.patch, MAPREDUCE-5652-v8.patch, MAPREDUCE-5652-v9-and-YARN-1987.patch, MAPREDUCE-5652.patch ShuffleHandler should work across NM restarts and not require re-running map tasks. Currently, an NM restart cleans up the map outputs, forcing re-execution of map tasks; this should be avoided. -- This message was sent by Atlassian JIRA (v6.2#6252)
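The recovery pattern involved can be sketched generically: persist enough per-job shuffle metadata that a restarted service can keep serving existing map outputs, and reload it on startup. This is a plain-Java stand-in using a properties file, not the actual recovery store the committed patch adds (note the ShuffleHandlerRecovery.proto in the commit file list above).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

public class RestartRecoverySketch {
    // Minimal stand-in for a service state store: persist per-job shuffle
    // metadata so a restarted service can keep serving existing outputs
    // instead of forcing map re-execution.
    private final Path store;
    private final Properties jobs = new Properties();

    RestartRecoverySketch(Path store) throws IOException {
        this.store = store;
        if (Files.exists(store)) {              // recovery path on restart
            try (InputStream in = Files.newInputStream(store)) {
                jobs.load(in);
            }
        }
    }

    void registerJob(String jobId, String secret) throws IOException {
        jobs.setProperty(jobId, secret);
        try (OutputStream out = Files.newOutputStream(store)) {
            jobs.store(out, "recoverable shuffle state (sketch)");
        }
    }

    String lookup(String jobId) {
        return jobs.getProperty(jobId);
    }

    public static void main(String[] args) throws Exception {
        Path p = Files.createTempFile("recovery", ".props");
        new RestartRecoverySketch(p).registerJob("job_1", "s3cret");
        // A new instance over the same file simulates a restarted service.
        System.out.println(new RestartRecoverySketch(p).lookup("job_1"));
        Files.delete(p);
    }
}
```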