[jira] [Updated] (MAPREDUCE-6670) TestJobListCache#testEviction sometimes fails on Windows with timeout
[ https://issues.apache.org/jira/browse/MAPREDUCE-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated MAPREDUCE-6670:
----------------------------------
    Fix Version/s: 2.7.3
                   2.8.0
     Release Note: Backport the fix to 2.7 and 2.8

> TestJobListCache#testEviction sometimes fails on Windows with timeout
> ----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6670
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6670
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 2.7.0, 2.8.0, 2.7.1, 2.7.2, 2.7.3
>         Environment: OS: Windows Server 2012
>                      JDK: 1.7.0_79
>            Reporter: Gergely Novák
>            Assignee: Gergely Novák
>            Priority: Minor
>             Fix For: 2.8.0, 2.7.3, 2.9.0
>
>         Attachments: MAPREDUCE-6670.001.patch, MAPREDUCE-6670.002.patch
>
>
> TestJobListCache#testEviction often needs more than 1000 ms to finish in a
> Windows environment. Increasing the timeout solves the issue.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
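The timeout failure described above can be illustrated with a small stand-alone sketch. The issue does not quote the test source, so this is not the Hadoop test itself: `TimeoutSketch`, `finishesWithin`, and `blockedTaskFinishes` are hypothetical names, and the code only uses `java.util.concurrent`. In JUnit 4 the same budget is usually expressed as `@Test(timeout = ...)`; whether the attached patches change exactly that value is an assumption.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration, not code from the Hadoop test suite.
public class TimeoutSketch {

  /** Returns true if the task completes within timeoutMs, false if the budget expires. */
  public static boolean finishesWithin(Callable<?> task, long timeoutMs) {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    Future<?> result = pool.submit(task);
    try {
      result.get(timeoutMs, TimeUnit.MILLISECONDS);
      return true;
    } catch (TimeoutException e) {
      return false; // the analogue of a test failing "with timeout"
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } catch (ExecutionException e) {
      throw new IllegalStateException("task failed", e.getCause());
    } finally {
      pool.shutdownNow(); // interrupts a task that is still running
    }
  }

  /** A task that cannot finish before the budget expires (it waits on a latch nobody releases). */
  public static boolean blockedTaskFinishes(long timeoutMs) {
    CountDownLatch never = new CountDownLatch(1);
    return finishesWithin(() -> { never.await(); return null; }, timeoutMs);
  }

  public static void main(String[] args) {
    System.out.println(finishesWithin(() -> "done", 5000)); // generous budget
    System.out.println(blockedTaskFinishes(50));            // budget too small
  }
}
```

A fixed budget that is comfortable on one platform can be too tight on a slower one, which is why raising the limit is a common fix for this kind of intermittent failure.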
[jira] [Updated] (MAPREDUCE-1270) Hadoop C++ Extention
[ https://issues.apache.org/jira/browse/MAPREDUCE-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

luoxu updated MAPREDUCE-1270:
-----------------------------
    Affects Version/s: 2.6.2

> Hadoop C++ Extention
> --------------------
>
>                 Key: MAPREDUCE-1270
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1270
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: task
>    Affects Versions: 0.20.1
>         Environment: hadoop linux
>            Reporter: Wang Shouyan
>         Attachments: HADOOP-HCE-1.0.0.patch, HCE InstallMenu.pdf, HCE
> Performance Report.pdf, HCE Tutorial.pdf, Overall Design of Hadoop C++
> Extension.doc
>
>
> Hadoop C++ extension is an internal project at Baidu. We started it for
> these reasons:
>    1 To provide a C++ API. We mostly used Streaming before, and we also
> tried PIPES, but we did not find PIPES to be more efficient than Streaming.
> So we think a new C++ extension is needed.
>    2 Even with PIPES or Streaming, it is hard to control the memory of the
> Hadoop map/reduce Child JVM.
>    3 It costs too much to read/write/sort TB/PB of data in Java. When using
> PIPES or Streaming, a pipe or socket is not efficient enough to carry that
> much data.
>    What we want to do:
>    1 We do not use the map/reduce Child JVM for any data processing. It
> just prepares the environment, starts the C++ mapper, tells the mapper which
> split it should deal with, and reads reports from the mapper until it
> finishes. The mapper will read records, invoke the user-defined map, do the
> partitioning, write spills, combine, and merge into file.out. We think these
> operations can be done in C++ code.
>    2 The reducer is similar to the mapper: it is started after the sort has
> finished, reads from the sorted files, invokes the user-defined reduce, and
> writes to the user-defined record writer.
>    3 We also intend to rewrite shuffle and sort in C++, for efficiency and
> memory control.
>    At first, 1 and 2, then 3.
>    What's the difference from PIPES:
>    1 Yes, we will reuse most of the PIPES code.
>    2 And we should do it more completely: nothing changes in scheduling and
> management, but everything changes in execution.
> *UPDATE:*
> Now you can get a test version of HCE from this link
> http://docs.google.com/leaf?id=0B5xhnqH1558YZjcxZmI0NzEtODczMy00NmZiLWFkNjAtZGM1MjZkMmNkNWFk&hl=zh_CN&pli=1
> This is a full package with all hadoop source code.
> By following the attached document "HCE InstallMenu.pdf", you can build and
> deploy it in your cluster.
> The attachment "HCE Tutorial.pdf" will lead you through writing the first
> HCE program and gives other specifications of the interface.
> The attachment "HCE Performance Report.pdf" gives a performance report of
> HCE compared to Java MapRed and Pipes.
> Any comments are welcome.
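The division of labour in point 1 of "What we want to do" (the Child JVM only starts the C++ mapper and reads its reports until it finishes) can be sketched from the Java side. This is a rough illustration only: the issue does not specify HCE's report format, so the `PROGRESS <fraction>` and `DONE` lines, the class `MapperReportLoopSketch`, and its methods are all hypothetical. In a real launcher the `Reader` would wrap the child process's output stream; here a string stands in for it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical illustration; the report lines below are NOT HCE's real protocol.
public class MapperReportLoopSketch {

  /**
   * Consumes report lines until the mapper says "DONE" or its stream ends.
   * Returns 1.0 on "DONE", otherwise the last progress value seen.
   */
  public static float readReports(Reader source) {
    float progress = 0.0f;
    try (BufferedReader reports = new BufferedReader(source)) {
      String line;
      while ((line = reports.readLine()) != null) {
        if (line.equals("DONE")) {
          return 1.0f; // the mapper finished its split
        }
        if (line.startsWith("PROGRESS ")) {
          progress = Float.parseFloat(line.substring("PROGRESS ".length()));
        }
        // Unknown lines are ignored in this sketch.
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return progress; // stream ended without "DONE", e.g. the mapper died
  }

  /** Convenience overload so the loop can be tried without a child process. */
  public static float readReports(String reportText) {
    return readReports(new StringReader(reportText));
  }

  public static void main(String[] args) {
    System.out.println(readReports("PROGRESS 0.25\nPROGRESS 0.5\nDONE\n"));
  }
}
```

The point of the design is that the JVM side stays a thin supervisor: it never touches record data, only status lines.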
[jira] [Updated] (MAPREDUCE-1270) Hadoop C++ Extention
[ https://issues.apache.org/jira/browse/MAPREDUCE-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

luoxu updated MAPREDUCE-1270:
-----------------------------
    Affects Version/s:     (was: 2.6.2)
[jira] [Commented] (MAPREDUCE-6607) .staging dir is not cleaned up if mapreduce.task.files.preserve.failedtask or mapreduce.task.files.preserve.filepattern are set
[ https://issues.apache.org/jira/browse/MAPREDUCE-6607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238955#comment-15238955 ]

Hadoop QA commented on MAPREDUCE-6607:
--------------------------------------

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 10s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 7m 9s | trunk passed |
| +1 | compile | 0m 20s | trunk passed with JDK v1.8.0_77 |
| +1 | compile | 0m 22s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 16s | trunk passed |
| +1 | mvnsite | 0m 28s | trunk passed |
| +1 | mvneclipse | 0m 13s | trunk passed |
| +1 | findbugs | 0m 46s | trunk passed |
| +1 | javadoc | 0m 15s | trunk passed with JDK v1.8.0_77 |
| +1 | javadoc | 0m 17s | trunk passed with JDK v1.7.0_95 |
| +1 | mvninstall | 0m 22s | the patch passed |
| +1 | compile | 0m 16s | the patch passed with JDK v1.8.0_77 |
| +1 | javac | 0m 16s | the patch passed |
| +1 | compile | 0m 20s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 0m 20s | the patch passed |
| -1 | checkstyle | 0m 14s | hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app: patch generated 1 new + 71 unchanged - 0 fixed = 72 total (was 71) |
| +1 | mvnsite | 0m 25s | the patch passed |
| +1 | mvneclipse | 0m 12s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 0m 57s | the patch passed |
| +1 | javadoc | 0m 14s | the patch passed with JDK v1.8.0_77 |
| +1 | javadoc | 0m 16s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 9m 40s | hadoop-mapreduce-client-app in the patch passed with JDK v1.8.0_77. |
| +1 | unit | 9m 55s | hadoop-mapreduce-client-app in the patch passed with JDK v1.7.0_95. |
| +1 | asflicense | 0m 18s | Patch does not generate ASF License warnings. |
| | | 34m 23s | |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12798473/MAPREDUCE-6607.05.patch |
| JIRA Issue | MAPREDUCE-6607 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 2e40e7adc04d 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptc
[jira] [Commented] (MAPREDUCE-6607) .staging dir is not cleaned up if mapreduce.task.files.preserve.failedtask or mapreduce.task.files.preserve.filepattern are set
[ https://issues.apache.org/jira/browse/MAPREDUCE-6607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238885#comment-15238885 ]

Kai Sasaki commented on MAPREDUCE-6607:
---------------------------------------

[~ajisakaa] Thank you so much for taking care of this. I updated the patch. Could you check it when you get a chance?

> .staging dir is not cleaned up if mapreduce.task.files.preserve.failedtask or
> mapreduce.task.files.preserve.filepattern are set
> ------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6607
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6607
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: applicationmaster
>    Affects Versions: 2.7.1
>            Reporter: Maysam Yabandeh
>            Assignee: Kai Sasaki
>            Priority: Minor
>         Attachments: MAPREDUCE-6607.01.patch, MAPREDUCE-6607.02.patch,
> MAPREDUCE-6607.03.patch, MAPREDUCE-6607.04.patch, MAPREDUCE-6607.05.patch
>
>
> If either of the following configs is set, then the .staging dir is not
> cleaned up:
> * mapreduce.task.files.preserve.failedtask
> * mapreduce.task.files.preserve.filepattern
> The former was supposed to keep only the .staging dir of failed tasks, and
> the latter was supposed to apply only if the task name matches the specified
> regular expression.
> {code}
>   protected boolean keepJobFiles(JobConf conf) {
>     return (conf.getKeepTaskFilesPattern() != null || conf
>         .getKeepFailedTaskFiles());
>   }
> {code}
> {code}
>   public void cleanupStagingDir() throws IOException {
>     /* make sure we clean the staging files */
>     String jobTempDir = null;
>     FileSystem fs = getFileSystem(getConfig());
>     try {
>       if (!keepJobFiles(new JobConf(getConfig()))) {
>         jobTempDir = getConfig().get(MRJobConfig.MAPREDUCE_JOB_DIR);
>         if (jobTempDir == null) {
>           LOG.warn("Job Staging directory is null");
>           return;
>         }
>         Path jobTempDirPath = new Path(jobTempDir);
>         LOG.info("Deleting staging directory " +
>             FileSystem.getDefaultUri(getConfig()) +
>             " " + jobTempDir);
>         fs.delete(jobTempDirPath, true);
>       }
>     } catch(IOException io) {
>       LOG.error("Failed to cleanup staging dir " + jobTempDir, io);
>     }
>   }
> {code}
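The gap between the quoted `keepJobFiles` check and the behaviour the reporter describes can be made concrete with a small stand-alone sketch. It is not the attached patch and does not use any Hadoop classes: `KeepJobFilesSketch`, `TaskOutcome`, `keepAsReported`, and `keepAsDescribed` are hypothetical names, and `keepAsDescribed` is only one possible reading of "keep the files of failed tasks, or of tasks whose name matches the pattern".

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration; not the MAPREDUCE-6607 patch and not a Hadoop API.
public class KeepJobFilesSketch {

  /** Minimal stand-in for one finished task (assumption, not a Hadoop type). */
  public static final class TaskOutcome {
    final String taskName;
    final boolean failed;

    public TaskOutcome(String taskName, boolean failed) {
      this.taskName = taskName;
      this.failed = failed;
    }
  }

  /** Mirrors the quoted check: merely configuring either option keeps .staging. */
  public static boolean keepAsReported(String keepPattern, boolean keepFailed) {
    return keepPattern != null || keepFailed;
  }

  /** One reading of the described intent: keep only if some task actually qualifies. */
  public static boolean keepAsDescribed(String keepPattern, boolean keepFailed,
                                        List<TaskOutcome> tasks) {
    Pattern pattern = (keepPattern == null) ? null : Pattern.compile(keepPattern);
    for (TaskOutcome task : tasks) {
      if (keepFailed && task.failed) {
        return true; // a failed task exists and failed-task files were requested
      }
      if (pattern != null && pattern.matcher(task.taskName).matches()) {
        return true; // a task name matches the preserve pattern
      }
    }
    return false;
  }

  public static void main(String[] args) {
    List<TaskOutcome> allSucceeded = Arrays.asList(
        new TaskOutcome("attempt_m_000000_0", false),
        new TaskOutcome("attempt_r_000000_0", false));
    // With preserve.failedtask set and no failures, the two checks disagree.
    System.out.println(keepAsReported(null, true));
    System.out.println(keepAsDescribed(null, true, allSucceeded));
  }
}
```

With the reported check, a job whose tasks all succeed still leaves its staging directory behind as soon as either option is configured, which is the symptom in the issue title.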
[jira] [Updated] (MAPREDUCE-6607) .staging dir is not cleaned up if mapreduce.task.files.preserve.failedtask or mapreduce.task.files.preserve.filepattern are set
[ https://issues.apache.org/jira/browse/MAPREDUCE-6607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kai Sasaki updated MAPREDUCE-6607:
----------------------------------
    Attachment: MAPREDUCE-6607.05.patch
[jira] [Commented] (MAPREDUCE-6672) TestTeraSort fails on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-6672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238881#comment-15238881 ]

Tibor Kiss commented on MAPREDUCE-6672:
---------------------------------------

Thanks [~djp] for the proposed enhancement.
I agree that it is not nice to make the testcase OS specific.
Unfortunately the proposed fix does not resolve the issue; the same error appears:
{noformat}
TestTeraSort.testTeraSort:92->runTeraSort:67 » IO No FileSystem for scheme: C
{noformat}
If we don't want to make a distinction in the testcase based on the OS, then we need to extend the URI/Scheme validator to expect Windows drive letters in front of the path.
Please let me know your preferred solution.
Thanks!

> TestTeraSort fails on Windows
> -----------------------------
>
>                 Key: MAPREDUCE-6672
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6672
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 3.0.0, 2.8.0
>         Environment: OS: Windows Server 2012
>                      JDK: Oracle 1.7.0_79
>            Reporter: Tibor Kiss
>            Assignee: Tibor Kiss
>            Priority: Minor
>         Attachments: MAPREDUCE-6672.01.patch
>
>
> TestTeraSort testcase fails on Windows.
> The test case uses the build directory as test working directory.
> Under Windows the build directory starts with a drive definition ( "C:" ),
> which is interpreted as (an invalid) URI scheme.
> The fix is trivial: Add URI scheme to the beginning of the working directory.
> Error message:
> {noformat}
> Running org.apache.hadoop.examples.terasort.TestTeraSort
> Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 3.647 sec <<< FAILURE! - in org.apache.hadoop.examples.terasort.TestTeraSort
> testTeraSort(org.apache.hadoop.examples.terasort.TestTeraSort)  Time elapsed: 3.359 sec  <<< ERROR!
> java.io.IOException: No FileSystem for scheme: C
>     at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2787)
>     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2798)
>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
>     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2837)
>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2819)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:381)
>     at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:223)
>     at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:93)
>     at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
>     at org.apache.hadoop.mapreduce.JobResourceUploader.uploadFiles(JobResourceUploader.java:179)
>     at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:98)
>     at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:193)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1338)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1742)
>     at org.apache.hadoop.mapreduce.Job.submit(Job.java:1338)
>     at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1359)
>     at org.apache.hadoop.examples.terasort.TeraSort.run(TeraSort.java:331)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>     at org.apache.hadoop.examples.terasort.TestTeraSort.runTeraSort(TestTeraSort.java:75)
>     at org.apache.hadoop.examples.terasort.TestTeraSort.testTeraSort(TestTeraSort.java:101)
> Results :
> Tests in error:
>   TestTeraSort.testTeraSort:101->runTeraSort:75 » IO No FileSystem for scheme: C
> Tests run: 2, Failures: 0, Errors: 1, Skipped: 0
> {noformat}
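Why a drive letter ends up as a "scheme" can be reproduced with the JDK alone. The sketch below uses only `java.net.URI`; the class `DriveLetterSchemeSketch`, its helper methods, and the example directory are hypothetical, and it does not reproduce Hadoop's own `Path` or `FileSystem` handling, which may treat Windows paths differently in other code paths.

```java
import java.net.URI;

// Hypothetical illustration using only the JDK; not the MAPREDUCE-6672 patch.
public class DriveLetterSchemeSketch {

  /** Returns the scheme a generic URI parser sees in the given location, or null. */
  public static String schemeOf(String location) {
    return URI.create(location).getScheme();
  }

  /**
   * Prefixes a Windows-style path with an explicit file scheme.
   * Assumes the path contains no characters that need URI escaping (e.g. spaces).
   */
  public static String withFileScheme(String windowsPath) {
    return "file:///" + windowsPath.replace('\\', '/');
  }

  public static void main(String[] args) {
    // Everything before the first ':' is parsed as the scheme, so the drive letter wins.
    System.out.println(schemeOf("C:/hadoop/target/test-dir"));                   // C
    // With an explicit scheme the drive letter becomes part of the path instead.
    System.out.println(schemeOf(withFileScheme("C:\\hadoop\\target\\test-dir"))); // file
  }
}
```

This matches the fix proposed in the issue description: once the working directory carries an explicit scheme, the lookup is for `file` rather than for `C`.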