[jira] [Commented] (MAPREDUCE-5536) mapreduce.jobhistory.webapp.https.address property is not respected
[ https://issues.apache.org/jira/browse/MAPREDUCE-5536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13783824#comment-13783824 ] Hudson commented on MAPREDUCE-5536: --- FAILURE: Integrated in Hadoop-Yarn-trunk #350 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/350/]) MAPREDUCE-5536. Fixed MR AM and JHS to respect mapreduce.jobhistory.webapp.https.address. Contributed by Omkar Vinit Joshi. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1528251) * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/client/MRClientService.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMCommunicator.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/AppController.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/JobBlock.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/NavBlock.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/TaskPage.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/WebAppUtil.java * 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/dao/AMAttemptInfo.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/jobhistory/JHAdminConfig.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/jobhistory/JobHistoryUtils.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRWebAppUtil.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRConfig.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs-plugins/src/main/java/org/apache/hadoop/mapreduce/v2/hs/webapp/MapReduceTrackingUriPlugin.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/CompletedJob.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryClientService.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JobHistoryServer.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/webapp/HsJobBlock.java * 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/webapp/HsTaskPage.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/MiniMRYarnCluster.java mapreduce.jobhistory.webapp.https.address property is not respected --- Key: MAPREDUCE-5536 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5536 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.1.1-beta Reporter: Yesha Vora Assignee: Omkar Vinit Joshi Priority: Blocker Fix For: 2.1.2-beta Attachments: MAPREDUCE-5536.20131027.1.patch, MAPREDUCE-5536.20131030.1.patch, MAPREDUCE-5536.20131030.1.patch, MAPREDUCE-5536.20131030.3-branch-2.1.patch, MAPREDUCE-5536.20131030.3.patch, YARN-1240.20131025.1.patch The jobhistory server starts on port
[jira] [Commented] (MAPREDUCE-5544) JobClient#getJob loads job conf twice
[ https://issues.apache.org/jira/browse/MAPREDUCE-5544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13783821#comment-13783821 ] Hudson commented on MAPREDUCE-5544: --- FAILURE: Integrated in Hadoop-Yarn-trunk #350 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/350/]) MAPREDUCE-5544. JobClient#getJob loads job conf twice. (Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1528196) * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobClient.java JobClient#getJob loads job conf twice - Key: MAPREDUCE-5544 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5544 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.1.2-beta Attachments: MAPREDUCE-5544-1.patch, MAPREDUCE-5544.patch Calling JobClient#getJob causes the job conf file to be loaded twice, once in the constructor of JobClient.NetworkedJob and once in Cluster#getJob. We should remove the former. MAPREDUCE-5001 was meant to fix a race that was causing problems in Hive tests, but the problem persists because it only fixed one of the places where the job conf file is loaded. -- This message was sent by Atlassian JIRA (v6.1#6144)
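The fix described above amounts to removing one of two call sites that fetch the same state. A generic, self-contained sketch of the pattern (all names are illustrative stand-ins, not the actual JobClient/Cluster internals): the wrapper's constructor accepts the already-loaded conf instead of re-fetching it.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative stand-ins for the JobClient double-load pattern; these are
// NOT the real Hadoop classes.
public class DoubleLoad {
    static final AtomicInteger loads = new AtomicInteger();

    // Stands in for the expensive job-conf fetch (an HDFS read in MR).
    static String loadJobConf(String jobId) {
        loads.incrementAndGet();
        return "conf-for-" + jobId;
    }

    // The wrapper takes the already-loaded conf rather than re-fetching it
    // in its constructor -- the shape of the MAPREDUCE-5544 fix.
    static class NetworkedJob {
        final String conf;
        NetworkedJob(String conf) { this.conf = conf; }
    }

    static NetworkedJob getJob(String jobId) {
        String conf = loadJobConf(jobId);   // loaded exactly once, in one place
        return new NetworkedJob(conf);
    }

    public static void main(String[] args) {
        loads.set(0);
        NetworkedJob job = getJob("job_1");
        System.out.println(job.conf + " loads=" + loads.get()); // loads=1
    }
}
```

Besides the wasted I/O, a single load site also closes the race the report mentions: there is only one window in which the conf file can be observed.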
[jira] [Commented] (MAPREDUCE-4421) Run MapReduce framework via the distributed cache
[ https://issues.apache.org/jira/browse/MAPREDUCE-4421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13783818#comment-13783818 ] Hudson commented on MAPREDUCE-4421: --- FAILURE: Integrated in Hadoop-Yarn-trunk #350 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/350/]) MAPREDUCE-4421. Run MapReduce framework via the distributed cache. Contributed by Jason Lowe (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1528237) * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapreduce/v2/util/TestMRApps.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobSubmitter.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/DistributedCacheDeploy.apt.vm * /hadoop/common/trunk/hadoop-project/src/site/site.xml Run MapReduce framework via the distributed cache - Key: MAPREDUCE-4421 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4421 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 2.0.0-alpha Reporter: Arun C Murthy Assignee: Jason Lowe Fix For: 3.0.0, 2.3.0 Attachments: MAPREDUCE-4421-2.patch, MAPREDUCE-4421-3.patch, MAPREDUCE-4421-4.patch, MAPREDUCE-4421.patch, MAPREDUCE-4421.patch Currently MR AM depends on MR jars being deployed 
on all nodes via implicit dependency on YARN_APPLICATION_CLASSPATH. We should stop adding mapreduce jars to YARN_APPLICATION_CLASSPATH and, probably, just rely on adding a shaded MR jar along with job.jar to the dist-cache. -- This message was sent by Atlassian JIRA (v6.1#6144)
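Per the DistributedCacheDeploy guide this change adds, deploying the framework via the distributed cache is configured roughly as follows. The HDFS path is a placeholder and the classpath value is abbreviated for the sketch; treat both values as illustrative, not the exact shipped defaults.

```xml
<!-- Upload a framework tarball to HDFS once, then point jobs at it.
     The URI fragment ("#mrframework") names the localized directory. -->
<property>
  <name>mapreduce.application.framework.path</name>
  <value>hdfs:///mapred/framework/hadoop-mapreduce.tar.gz#mrframework</value>
</property>
<!-- Classpath abbreviated; the real guide lists the full set of jar
     globs under the unpacked tarball, ahead of any node-local jars. -->
<property>
  <name>mapreduce.application.classpath</name>
  <value>job.jar/job.jar,job.jar/classes/,job.jar/lib/*,$PWD/mrframework/*</value>
</property>
```

With this in place the AM and tasks resolve the MR jars from the localized tarball, so nothing MR-specific needs to live in YARN_APPLICATION_CLASSPATH.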
[jira] [Updated] (MAPREDUCE-5549) distcp app should fail if m/r job fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated MAPREDUCE-5549: -- Affects Version/s: 3.0.0 distcp app should fail if m/r job fails --- Key: MAPREDUCE-5549 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5549 Project: Hadoop Map/Reduce Issue Type: Bug Components: distcp, mrv2 Affects Versions: 3.0.0 Reporter: David Rosenstrauch I run distcpv2 in a scripted manner. The script checks if the distcp step fails and, if so, aborts the rest of the script. However, I ran into an issue today where the distcp job failed, but my calling script went on its merry way. Digging into the code a bit more (at https://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCp.java), I think I see the issue: the distcp app is not returning an error exit code to the shell when the distcp job fails. This is a big problem, IMO, as it prevents distcp from being successfully used in a scripted environment. IMO, the code should change like so:

Before:
{code:title=org.apache.hadoop.tools.DistCp.java}
//...
public int run(String[] argv) {
  //...
  try {
    execute();
  } catch (InvalidInputException e) {
    LOG.error("Invalid input: ", e);
    return DistCpConstants.INVALID_ARGUMENT;
  } catch (DuplicateFileException e) {
    LOG.error("Duplicate files in input path: ", e);
    return DistCpConstants.DUPLICATE_INPUT;
  } catch (Exception e) {
    LOG.error("Exception encountered ", e);
    return DistCpConstants.UNKNOWN_ERROR;
  }
  return DistCpConstants.SUCCESS;
}
//...
{code}

After:
{code:title=org.apache.hadoop.tools.DistCp.java}
//...
public int run(String[] argv) {
  //...
  Job job = null;
  try {
    job = execute();
  } catch (InvalidInputException e) {
    LOG.error("Invalid input: ", e);
    return DistCpConstants.INVALID_ARGUMENT;
  } catch (DuplicateFileException e) {
    LOG.error("Duplicate files in input path: ", e);
    return DistCpConstants.DUPLICATE_INPUT;
  } catch (Exception e) {
    LOG.error("Exception encountered ", e);
    return DistCpConstants.UNKNOWN_ERROR;
  }
  if (job.isSuccessful()) {
    return DistCpConstants.SUCCESS;
  } else {
    return DistCpConstants.UNKNOWN_ERROR;
  }
}
//...
{code}

-- This message was sent by Atlassian JIRA (v6.1#6144)
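The proposed change boils down to mapping the job's outcome onto the process exit status. A minimal, self-contained sketch of that mapping (the constant names follow DistCpConstants as quoted above, but the numeric values here are illustrative assumptions, and the booleans stand in for execute() and Job.isSuccessful()):

```java
public class ExitCodeSketch {
    // Hypothetical stand-ins for DistCpConstants values; the names come
    // from the issue text, the numbers are illustrative only.
    static final int SUCCESS = 0;
    static final int UNKNOWN_ERROR = -999;

    /** Mirrors the proposed run(): succeed only if the launched job succeeded. */
    static int run(boolean jobLaunched, boolean jobSuccessful) {
        if (!jobLaunched) {
            return UNKNOWN_ERROR;       // execute() threw an exception
        }
        // The added check: a job that completed but FAILED must not map
        // to SUCCESS, or calling scripts cannot detect the failure.
        return jobSuccessful ? SUCCESS : UNKNOWN_ERROR;
    }

    public static void main(String[] args) {
        System.out.println(run(true, true));   // 0: shell sees success
        System.out.println(run(true, false));  // -999: script can abort
    }
}
```

A wrapper script can then branch on the exit status (e.g. `hadoop distcp ... || exit 1`), which is exactly the scripted use the reporter describes.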
[jira] [Updated] (MAPREDUCE-5549) distcp app should fail if m/r job fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated MAPREDUCE-5549: -- Attachment: MAPREDUCE-5549-001.patch Patch with some changes from the original proposal # check for {{job.isSuccess()}} performed in try/catch block, so that IOEs get caught. # added new {{JOB_FAILED}} value; lets scripts people interpret failure better I also took the opportunity to review the run method and patched the catch of Throwable in the parse phase to print Exception.toString, not Exception.getMessage, for the usual reasons. Tests: no, none distcp app should fail if m/r job fails --- Key: MAPREDUCE-5549 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5549 Project: Hadoop Map/Reduce Issue Type: Bug Components: distcp, mrv2 Affects Versions: 3.0.0 Reporter: David Rosenstrauch Attachments: MAPREDUCE-5549-001.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (MAPREDUCE-5549) distcp app should fail if m/r job fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated MAPREDUCE-5549: -- Target Version/s: 3.0.0, 2.3.0 Status: Patch Available (was: Open) distcp app should fail if m/r job fails --- Key: MAPREDUCE-5549 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5549 Project: Hadoop Map/Reduce Issue Type: Bug Components: distcp, mrv2 Affects Versions: 3.0.0 Reporter: David Rosenstrauch Attachments: MAPREDUCE-5549-001.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (MAPREDUCE-5549) distcp app should fail if m/r job fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784010#comment-13784010 ] Hadoop QA commented on MAPREDUCE-5549: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12606282/MAPREDUCE-5549-001.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-tools/hadoop-distcp. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4079//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4079//console This message is automatically generated. distcp app should fail if m/r job fails --- Key: MAPREDUCE-5549 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5549 Project: Hadoop Map/Reduce Issue Type: Bug Components: distcp, mrv2 Affects Versions: 3.0.0 Reporter: David Rosenstrauch Attachments: MAPREDUCE-5549-001.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (MAPREDUCE-5530) Binary and source incompatibility in mapred.lib.CombineFileInputFormat between branch-1 and branch-2
[ https://issues.apache.org/jira/browse/MAPREDUCE-5530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated MAPREDUCE-5530: - Resolution: Fixed Fix Version/s: 2.1.2-beta Status: Resolved (was: Patch Available) I just committed this. Thanks [~rkanter]! Also, thanks to [~sandyr] and [~zjshen] for reviews and feedback. Binary and source incompatibility in mapred.lib.CombineFileInputFormat between branch-1 and branch-2 Key: MAPREDUCE-5530 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5530 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: mrv1, mrv2 Affects Versions: 2.1.1-beta Reporter: Robert Kanter Assignee: Robert Kanter Priority: Blocker Fix For: 2.1.2-beta Attachments: MAPREDUCE-5530.patch, MAPREDUCE-5530.patch, MAPREDUCE-5530.patch, MAPREDUCE-5530.patch {{mapred.lib.CombineFileInputFormat}} in branch-1 has this method: {code:java} protected boolean isSplitable(FileSystem fs, Path file) {code} In branch-2, {{mapred.lib.CombineFileInputFormat}} is now a subclass of {{mapreduce.lib.input.CombineFileInputFormat}}, from which it inherits the similar method: {code:java} protected boolean isSplitable(JobContext context, Path file) {code} This means that any code that subclasses {{mapred.lib.CombineFileInputFormat}} and does not provide its own implementation of {{protected boolean isSplitable(FileSystem fs, Path file)}} will not be binary or source compatible if it tries to call {{isSplitable}} with a {{FileSystem}} argument anywhere (that is, if compiled against branch-1, it will throw a {{NoSuchMethodError}} if run against branch-2; also, it won't even compile against branch-2). -- This message was sent by Atlassian JIRA (v6.1#6144)
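The shape of the committed fix, re-introducing the old overload and delegating the new one to it, can be sketched with stand-in types. FileSystem and JobContext here are empty stubs so the example compiles without Hadoop on the classpath, Path is simplified to String, and the class names are illustrative, not the real hierarchy.

```java
// Empty stubs so the sketch compiles without Hadoop on the classpath.
class FileSystem {}
class JobContext {}

// branch-2 style base: only the JobContext overload exists.
class NewStyleFormat {
    protected boolean isSplitable(JobContext context, String file) { return true; }
}

// Shape of the fix: re-introduce the branch-1 overload and make the
// inherited branch-2 overload delegate to it, so user subclasses that
// only override the old method keep their behaviour.
class CompatFormat extends NewStyleFormat {
    @Override
    protected boolean isSplitable(JobContext context, String file) {
        return isSplitable((FileSystem) null, file);  // delegate to old API
    }
    protected boolean isSplitable(FileSystem fs, String file) { return true; }
}

// A user class written against branch-1: overrides only the old overload.
public class LegacyUserFormat extends CompatFormat {
    @Override
    protected boolean isSplitable(FileSystem fs, String file) {
        return !file.endsWith(".gz");   // e.g. never split gzipped input
    }
    public boolean check(String file) {
        // The framework calls the new-style overload...
        return isSplitable((JobContext) null, file);
    }
    public static void main(String[] args) {
        LegacyUserFormat f = new LegacyUserFormat();
        // ...but the user's old-style override is still honoured:
        System.out.println(f.check("part-00000")); // true
        System.out.println(f.check("archive.gz")); // false
    }
}
```

The delegation preserves semantic compatibility: without it, a branch-1 subclass like LegacyUserFormat would compile yet silently lose its splittability logic, since the framework would only ever invoke the JobContext overload.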
[jira] [Commented] (MAPREDUCE-5530) Binary and source incompatibility in mapred.lib.CombineFileInputFormat between branch-1 and branch-2
[ https://issues.apache.org/jira/browse/MAPREDUCE-5530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784127#comment-13784127 ] Hudson commented on MAPREDUCE-5530: --- SUCCESS: Integrated in Hadoop-trunk-Commit #4517 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4517/]) MAPREDUCE-5530. Fix compat with hadoop-1 in mapred.lib.CombinFileInputFormat by re-introducing isSplittable(FileSystem, Path) api and ensuring semantic compatibility. Contributed by Robert Kanter. (acmurthy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1528533) * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/CombineFileInputFormat.java Binary and source incompatibility in mapred.lib.CombineFileInputFormat between branch-1 and branch-2 Key: MAPREDUCE-5530 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5530 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: mrv1, mrv2 Affects Versions: 2.1.1-beta Reporter: Robert Kanter Assignee: Robert Kanter Priority: Blocker Fix For: 2.1.2-beta Attachments: MAPREDUCE-5530.patch, MAPREDUCE-5530.patch, MAPREDUCE-5530.patch, MAPREDUCE-5530.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (MAPREDUCE-5549) distcp app should fail if m/r job fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784312#comment-13784312 ] Ravi Prakash commented on MAPREDUCE-5549: - +1. Looks good to me. Thanks Steve! distcp app should fail if m/r job fails --- Key: MAPREDUCE-5549 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5549 Project: Hadoop Map/Reduce Issue Type: Bug Components: distcp, mrv2 Affects Versions: 3.0.0 Reporter: David Rosenstrauch Attachments: MAPREDUCE-5549-001.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (MAPREDUCE-5442) $HADOOP_MAPRED_HOME/$HADOOP_CONF_DIR setting not working on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784388#comment-13784388 ] Chuan Liu commented on MAPREDUCE-5442: -- +1. Looks good to me as well. $HADOOP_MAPRED_HOME/$HADOOP_CONF_DIR setting not working on Windows --- Key: MAPREDUCE-5442 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5442 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 3.0.0, 2.1.1-beta Reporter: Yingda Chen Assignee: Yingda Chen Attachments: MAPREDUCE-5442-2.patch, MAPREDUCE-5442-3.patch, MAPREDUCE-5442.patch Currently the mapred-default.xml has mapreduce.application.classpath entry set to $HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/* which is problematic on Windows since the path does not work on Windows OS. Additionally, the yarn-default.xml has yarn.application.classpath entry that has similar problem, and is currently being tracked by YARN-1138 -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (MAPREDUCE-5102) fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db
[ https://issues.apache.org/jira/browse/MAPREDUCE-5102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784417#comment-13784417 ] Nathan Roberts commented on MAPREDUCE-5102: --- Hi Aleksey, thanks for the patch. Seems testDataDrivenDBInputFormat and testDateSplitter are timezone dependent, i.e. fails unless TZ=GMT+3. Can we make the test independent of timezone? fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db Key: MAPREDUCE-5102 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5102 Project: Hadoop Map/Reduce Issue Type: Test Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha Reporter: Aleksey Gorshkov Assignee: Aleksey Gorshkov Attachments: MAPREDUCE-5102-branch-0.23.patch, MAPREDUCE-5102-branch-0.23-v1.patch, MAPREDUCE-5102-trunk.patch, MAPREDUCE-5102-trunk-v1.patch fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db patch MAPREDUCE-5102-trunk.patch for trunk and branch-2 patch MAPREDUCE-5102-branch-0.23.patch for branch-0.23 only -- This message was sent by Atlassian JIRA (v6.1#6144)
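One common way to make such tests timezone-independent is to pin the JVM default time zone for the duration of the test and restore it afterwards (in JUnit, typically in @Before/@After). A minimal sketch of the pattern, not the actual test code:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class TimeZonePinning {
    /** Formats a timestamp while the JVM default time zone is pinned to UTC. */
    static String formatPinned(long millis) {
        TimeZone saved = TimeZone.getDefault();
        try {
            // Pin the default zone (the equivalent of a @Before step), so the
            // result no longer depends on the host's TZ setting.
            TimeZone.setDefault(TimeZone.getTimeZone("UTC"));
            SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
            fmt.setTimeZone(TimeZone.getDefault());
            return fmt.format(new Date(millis));
        } finally {
            TimeZone.setDefault(saved);  // restore (the @After step)
        }
    }

    public static void main(String[] args) {
        // Epoch 0 renders identically no matter what TZ the host uses.
        System.out.println(formatPinned(0L)); // 1970-01-01 00:00:00
    }
}
```

The same idea applies to the date-splitter assertions: compute expected boundary strings under the pinned zone instead of hard-coding values that only hold for GMT+3.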
[jira] [Updated] (MAPREDUCE-3801) org.apache.hadoop.mapreduce.v2.app.TestRuntimeEstimators.testExponentialEstimator fails intermittently
[ https://issues.apache.org/jira/browse/MAPREDUCE-3801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated MAPREDUCE-3801: -- Summary: org.apache.hadoop.mapreduce.v2.app.TestRuntimeEstimators.testExponentialEstimator fails intermittently (was: org.apache.hadoop.mapreduce.v2.app.TestRuntimeEstimators,testExponentialEstimator fails intermittently) org.apache.hadoop.mapreduce.v2.app.TestRuntimeEstimators.testExponentialEstimator fails intermittently -- Key: MAPREDUCE-3801 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3801 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.24.0 Reporter: Robert Joseph Evans Attachments: org.apache.hadoop.mapreduce.v2.app.TestRuntimeEstimators-output.txt, org.apache.hadoop.mapreduce.v2.app.TestRuntimeEstimators.txt, TEST-org.apache.hadoop.mapreduce.v2.app.TestRuntimeEstimators.xml org.apache.hadoop.mapreduce.v2.app.TestRuntimeEstimators,testExponentialEstimator fails intermittently -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (MAPREDUCE-5554) hdfs-site.xml included in hadoop-mapreduce-client-jobclient tests jar is breaking tests for downstream components
[ https://issues.apache.org/jira/browse/MAPREDUCE-5554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784438#comment-13784438 ] Sandy Ryza commented on MAPREDUCE-5554: --- +1 hdfs-site.xml included in hadoop-mapreduce-client-jobclient tests jar is breaking tests for downstream components - Key: MAPREDUCE-5554 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5554 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: trunk, 2.1.1-beta Reporter: Robert Kanter Assignee: Robert Kanter Priority: Minor Attachments: MAPREDUCE-5554.patch The hadoop-mapreduce-client-jobclient tests jar has an hdfs-site.xml in it, so if its in the classpath first, then a downstream component's tests can fail if it needs to use a different hdfs-site.xml as the one in the mapreduce jar gets picked up instead. We should remove it from the jar. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (MAPREDUCE-5459) Update the doc of running MRv1 examples jar on YARN
[ https://issues.apache.org/jira/browse/MAPREDUCE-5459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated MAPREDUCE-5459: - Resolution: Fixed Fix Version/s: 2.1.2-beta Status: Resolved (was: Patch Available) I just committed this. Thanks [~zjshen]! Update the doc of running MRv1 examples jar on YARN --- Key: MAPREDUCE-5459 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5459 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Zhijie Shen Assignee: Zhijie Shen Fix For: 2.1.2-beta Attachments: MAPREDUCE-5459.1.patch In addition to adding two env vars: HADOOP_USER_CLASSPATH_FIRST and HADOOP_CLASSPATH, we still need to add
{code}
<property>
  <name>mapreduce.job.user.classpath.first</name>
  <value>true</value>
</property>
{code}
in mapred-site.xml to make sure that the MRv1 examples jar runs correctly on YARN. Some examples will use Java reflection to find the classes in the examples jar dynamically when they are running. With this configuration, the MRv1 examples jar will appear before the MRv2 examples jar in CLASSPATH of the processes in YARN containers. Therefore, the classes found via reflection will be picked from MRv1 examples jar instead of MRv2 examples jar as well. MapReduce_Compatibility_Hadoop1_Hadoop2.apt.vm needs to be updated to document this. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (MAPREDUCE-5459) Update the doc of running MRv1 examples jar on YARN
[ https://issues.apache.org/jira/browse/MAPREDUCE-5459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784442#comment-13784442 ] Hudson commented on MAPREDUCE-5459: --- SUCCESS: Integrated in Hadoop-trunk-Commit #4519 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4519/]) MAPREDUCE-5459. Update documentation on how to run MRv1 examples on YARN. Contributed by Zhijie Shen. (acmurthy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1528626) * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/MapReduce_Compatibility_Hadoop1_Hadoop2.apt.vm Update the doc of running MRv1 examples jar on YARN --- Key: MAPREDUCE-5459 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5459 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Zhijie Shen Assignee: Zhijie Shen Fix For: 2.1.2-beta Attachments: MAPREDUCE-5459.1.patch In addition to adding two env vars, HADOOP_USER_CLASSPATH_FIRST and HADOOP_CLASSPATH, we still need to add
{code}
<property>
  <name>mapreduce.job.user.classpath.first</name>
  <value>true</value>
</property>
{code}
in mapred-site.xml to make sure that the MRv1 examples jar runs correctly on YARN. Some examples use Java reflection to find the classes in the examples jar dynamically at runtime. With this configuration, the MRv1 examples jar will appear before the MRv2 examples jar in the CLASSPATH of the processes in YARN containers, so the classes found via reflection will be picked up from the MRv1 examples jar instead of the MRv2 examples jar. MapReduce_Compatibility_Hadoop1_Hadoop2.apt.vm needs to be updated to document this. -- This message was sent by Atlassian JIRA (v6.1#6144)
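For reference, the property above belongs inside the standard `<configuration>` element of mapred-site.xml; a minimal fragment showing it in context (the two env vars, HADOOP_USER_CLASSPATH_FIRST and HADOOP_CLASSPATH, are set separately in the shell environment, not in this file):

```xml
<?xml version="1.0"?>
<!-- mapred-site.xml fragment: put the user's jars (here, the MRv1 examples
     jar on HADOOP_CLASSPATH) ahead of the framework's jars in task CLASSPATHs. -->
<configuration>
  <property>
    <name>mapreduce.job.user.classpath.first</name>
    <value>true</value>
  </property>
</configuration>
```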
[jira] [Commented] (MAPREDUCE-5554) hdfs-site.xml included in hadoop-mapreduce-client-jobclient tests jar is breaking tests for downstream components
[ https://issues.apache.org/jira/browse/MAPREDUCE-5554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784518#comment-13784518 ] Hudson commented on MAPREDUCE-5554: --- SUCCESS: Integrated in Hadoop-trunk-Commit #4521 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4521/]) MAPREDUCE-5554. hdfs-site.xml included in hadoop-mapreduce-client-jobclient tests jar is breaking tests for downstream components (Robert Kanter via Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1528643) * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/pom.xml hdfs-site.xml included in hadoop-mapreduce-client-jobclient tests jar is breaking tests for downstream components - Key: MAPREDUCE-5554 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5554 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: trunk, 2.1.1-beta Reporter: Robert Kanter Assignee: Robert Kanter Priority: Minor Attachments: MAPREDUCE-5554.patch The hadoop-mapreduce-client-jobclient tests jar has an hdfs-site.xml in it, so if it's first in the classpath, a downstream component's tests can fail when they need a different hdfs-site.xml, because the one in the mapreduce jar gets picked up instead. We should remove it from the jar. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (MAPREDUCE-5554) hdfs-site.xml included in hadoop-mapreduce-client-jobclient tests jar is breaking tests for downstream components
[ https://issues.apache.org/jira/browse/MAPREDUCE-5554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated MAPREDUCE-5554: -- Resolution: Fixed Fix Version/s: 2.1.2-beta Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks Robert. I just committed this to trunk, branch-2, and branch-2.1-beta. hdfs-site.xml included in hadoop-mapreduce-client-jobclient tests jar is breaking tests for downstream components - Key: MAPREDUCE-5554 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5554 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: trunk, 2.1.1-beta Reporter: Robert Kanter Assignee: Robert Kanter Priority: Minor Fix For: 2.1.2-beta Attachments: MAPREDUCE-5554.patch The hadoop-mapreduce-client-jobclient tests jar has an hdfs-site.xml in it, so if it's first in the classpath, a downstream component's tests can fail when they need a different hdfs-site.xml, because the one in the mapreduce jar gets picked up instead. We should remove it from the jar. -- This message was sent by Atlassian JIRA (v6.1#6144)
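The committed change to hadoop-mapreduce-client-jobclient/pom.xml is not quoted in this thread; a plausible sketch, assuming the tests jar is built with the maven-jar-plugin's test-jar goal (the execution details here are illustrative, not the actual patch), is an exclusion like:

```xml
<!-- Hypothetical pom.xml sketch: keep hdfs-site.xml out of the tests jar
     so downstream projects' own hdfs-site.xml wins on the classpath. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-jar-plugin</artifactId>
  <executions>
    <execution>
      <goals>
        <goal>test-jar</goal>
      </goals>
      <configuration>
        <excludes>
          <exclude>hdfs-site.xml</exclude>
        </excludes>
      </configuration>
    </execution>
  </executions>
</plugin>
```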
[jira] [Commented] (MAPREDUCE-5547) Job history should not be flushed to JHS until AM gets unregistered
[ https://issues.apache.org/jira/browse/MAPREDUCE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784643#comment-13784643 ] Jian He commented on MAPREDUCE-5547: The problem is that the history data can be copied to the done_intermediate directory and then the unregister fails. The AM is then relaunched, but users already see the finished status of the job in the history server. bq. If the history server has already moved it from done_intermediate to done then the history server could either re-update the history with the new copy in done_intermediate or simply delete the redundant copy in done_intermediate. We can do this, but users will still see the finished status of the job after the 1st AM unregisters; the status will not be updated until the next AM finishes. bq. If we unregister before copying the history data to the done_intermediate directory then the client could try to query the history server before the AM has had a chance to copy the jhist file. Yes, the job may be missing from the history server for some time, but it will eventually show up after the history data is copied to done_intermediate. Job history should not be flushed to JHS until AM gets unregistered --- Key: MAPREDUCE-5547 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5547 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Zhijie Shen Assignee: Zhijie Shen -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (MAPREDUCE-5489) MR jobs hangs as it does not use the node-blacklisting feature in RM requests
[ https://issues.apache.org/jira/browse/MAPREDUCE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated MAPREDUCE-5489: --- Status: Patch Available (was: Open) MR jobs hangs as it does not use the node-blacklisting feature in RM requests - Key: MAPREDUCE-5489 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5489 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Yesha Vora Assignee: Zhijie Shen Attachments: MAPREDUCE-5489.1.patch When the RM restarted and one NM went bad during the restart (bad disk), the NM got blacklisted by the AM, but the RM kept giving out containers on the same node even though the AM didn't want them there. We need to change the AM to explicitly blacklist nodes in its RM requests. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (MAPREDUCE-5489) MR jobs hangs as it does not use the node-blacklisting feature in RM requests
[ https://issues.apache.org/jira/browse/MAPREDUCE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated MAPREDUCE-5489: --- Attachment: MAPREDUCE-5489.1.patch I've created the patch to make the AM send blacklisted nodes to the RM. Basically, the logic is as follows:
1. Add blacklistAdditions and blacklistRemovals to remember the blacklisted nodes added or removed between two allocate calls. The two collections will be sent to the RM in the upcoming allocate call.
2. Whenever a container fails on a host, the host will be blacklisted and will be added to blacklistAdditions if the blacklist is not being ignored.
3. When changing from not ignoring the blacklist to ignoring it, we add all the blacklisted nodes to blacklistRemovals.
4. When changing from ignoring the blacklist to not ignoring it, we add all the blacklisted nodes to blacklistAdditions.
5. Switching between ignoring and not ignoring blacklisted nodes will not take effect until the upcoming allocate call, but it will take effect eventually.
Test cases have been modified to test whether the RM is aware of the blacklisted nodes. MR jobs hangs as it does not use the node-blacklisting feature in RM requests - Key: MAPREDUCE-5489 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5489 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Yesha Vora Assignee: Zhijie Shen Attachments: MAPREDUCE-5489.1.patch When the RM restarted and one NM went bad during the restart (bad disk), the NM got blacklisted by the AM, but the RM kept giving out containers on the same node even though the AM didn't want them there. We need to change the AM to explicitly blacklist nodes in its RM requests. -- This message was sent by Atlassian JIRA (v6.1#6144)
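The five steps above can be sketched roughly as follows; the class and method names here are illustrative, not the actual RMContainerRequestor code from the patch:

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the blacklist bookkeeping described in steps 1-5.
public class BlacklistTracker {
    private final Set<String> blacklistedNodes = new HashSet<>();
    // Step 1: deltas accumulated between two allocate calls.
    private final Set<String> blacklistAdditions = new HashSet<>();
    private final Set<String> blacklistRemovals = new HashSet<>();
    private boolean ignoreBlacklisting = false;

    // Step 2: a container failed on this host.
    public void containerFailedOn(String host) {
        if (blacklistedNodes.add(host) && !ignoreBlacklisting) {
            blacklistAdditions.add(host);
            blacklistRemovals.remove(host);
        }
    }

    // Steps 3-4: flipping the ignore flag moves the whole blacklist
    // into the corresponding delta list for the next allocate call.
    public void setIgnoreBlacklisting(boolean ignore) {
        if (ignore == ignoreBlacklisting) {
            return;
        }
        ignoreBlacklisting = ignore;
        if (ignore) {
            blacklistRemovals.addAll(blacklistedNodes);
            blacklistAdditions.clear();
        } else {
            blacklistAdditions.addAll(blacklistedNodes);
            blacklistRemovals.clear();
        }
    }

    // Steps 1 and 5: deltas are handed to the RM on the next allocate
    // call and then reset, so each change is reported exactly once.
    public Set<String> drainAdditions() {
        Set<String> out = new HashSet<>(blacklistAdditions);
        blacklistAdditions.clear();
        return Collections.unmodifiableSet(out);
    }

    public Set<String> drainRemovals() {
        Set<String> out = new HashSet<>(blacklistRemovals);
        blacklistRemovals.clear();
        return Collections.unmodifiableSet(out);
    }
}
```

The drain-on-allocate design is why step 5 holds: a toggle only changes the pending delta sets, so the RM learns about it on the next allocate call at the latest.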
[jira] [Commented] (MAPREDUCE-5489) MR jobs hangs as it does not use the node-blacklisting feature in RM requests
[ https://issues.apache.org/jira/browse/MAPREDUCE-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784701#comment-13784701 ] Hadoop QA commented on MAPREDUCE-5489: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12606496/MAPREDUCE-5489.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4080//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4080//console This message is automatically generated. MR jobs hangs as it does not use the node-blacklisting feature in RM requests - Key: MAPREDUCE-5489 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5489 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Yesha Vora Assignee: Zhijie Shen Attachments: MAPREDUCE-5489.1.patch When the RM restarted and one NM went bad during the restart (bad disk), the NM got blacklisted by the AM, but the RM kept giving out containers on the same node even though the AM didn't want them there.
We need to change the AM to explicitly blacklist nodes in its RM requests. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (MAPREDUCE-5555) distcp should infer optimal number of mappers
Rob Weltman created MAPREDUCE-: -- Summary: distcp should infer optimal number of mappers Key: MAPREDUCE- URL: https://issues.apache.org/jira/browse/MAPREDUCE- Project: Hadoop Map/Reduce Issue Type: New Feature Components: distcp Reporter: Rob Weltman Rather than requiring the user to calculate and provide an optimal number of mappers with the -m option, distcp should (if the option is not provided) be able to estimate a reasonable number. -- This message was sent by Atlassian JIRA (v6.1#6144)
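One way to frame the requested estimate is one mapper per fixed chunk of input bytes, capped by the file count and a cluster-friendly upper bound. This is a sketch only: the constants, class name, and method are illustrative assumptions, not DistCp's actual defaults or implementation.

```java
// Hypothetical sketch of inferring a default map count for distcp
// when -m is not supplied.
public class MapperEstimator {
    static final long BYTES_PER_MAP = 256L * 1024 * 1024; // assumed bytes per mapper
    static final int MAX_MAPS = 200;                      // assumed upper bound

    static int estimateMaps(long totalBytes, int totalFiles) {
        // Roughly one map per BYTES_PER_MAP of input (rounded up),
        // no more maps than files (a file is copied by a single mapper),
        // never more than MAX_MAPS, and always at least one map.
        long byBytes = (totalBytes + BYTES_PER_MAP - 1) / BYTES_PER_MAP;
        long estimate = Math.min(byBytes, Math.min(totalFiles, MAX_MAPS));
        return (int) Math.max(1, estimate);
    }
}
```

For example, a 1 GB copy of 10 files would get 4 mappers under these assumed constants, while a 1 TB copy of 10,000 files would be capped at 200.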
[jira] [Updated] (MAPREDUCE-4710) Add peak memory usage counter for each task
[ https://issues.apache.org/jira/browse/MAPREDUCE-4710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cindy Li updated MAPREDUCE-4710: Attachment: mapreduce4710.patch Rebased patch to latest trunk. Add peak memory usage counter for each task --- Key: MAPREDUCE-4710 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4710 Project: Hadoop Map/Reduce Issue Type: New Feature Components: task Affects Versions: 1.0.2 Reporter: Cindy Li Assignee: Cindy Li Priority: Minor Labels: patch Attachments: mapreduce-4710.patch, mapreduce4710.patch, MAPREDUCE-4710-trunk.patch, mapreduce-4710-v1.0.2.patch Each task has counters PHYSICAL_MEMORY_BYTES and VIRTUAL_MEMORY_BYTES, which are snapshots of memory usage of that task. They are not sufficient for users to understand peak memory usage by that task, e.g. in order to diagnose task failures, tune job parameters or change application design. This new feature will add two more counters for each task: PHYSICAL_MEMORY_BYTES_MAX and VIRTUAL_MEMORY_BYTES_MAX. -- This message was sent by Atlassian JIRA (v6.1#6144)
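The proposed *_MAX counters reduce to a running maximum over the per-snapshot values the task monitor already collects. An illustrative sketch, with class and method names that are assumptions rather than the patch's actual code:

```java
// Hypothetical sketch of tracking peak memory alongside the existing
// PHYSICAL_MEMORY_BYTES / VIRTUAL_MEMORY_BYTES snapshot updates.
public class PeakMemoryCounter {
    private long physicalPeak = 0;
    private long virtualPeak = 0;

    // Called at each monitoring snapshot; keeps the largest value seen,
    // which is what PHYSICAL_MEMORY_BYTES_MAX / VIRTUAL_MEMORY_BYTES_MAX
    // would report at task completion.
    public void update(long physicalBytes, long virtualBytes) {
        physicalPeak = Math.max(physicalPeak, physicalBytes);
        virtualPeak = Math.max(virtualPeak, virtualBytes);
    }

    public long getPhysicalPeak() { return physicalPeak; }
    public long getVirtualPeak() { return virtualPeak; }
}
```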