[jira] [Updated] (MAPREDUCE-6670) TestJobListCache#testEviction sometimes fails on Windows with timeout

2016-04-13 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated MAPREDUCE-6670:
--
Fix Version/s: 2.7.3
   2.8.0
 Release Note: Backport the fix to 2.7 and 2.8

> TestJobListCache#testEviction sometimes fails on Windows with timeout
> -
>
> Key: MAPREDUCE-6670
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6670
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.0, 2.8.0, 2.7.1, 2.7.2, 2.7.3
> Environment: OS: Windows Server 2012
> JDK: 1.7.0_79
>Reporter: Gergely Novák
>Assignee: Gergely Novák
>Priority: Minor
> Fix For: 2.8.0, 2.7.3, 2.9.0
>
> Attachments: MAPREDUCE-6670.001.patch, MAPREDUCE-6670.002.patch
>
>
> TestJobListCache#testEviction often needs more than 1000 ms to finish in 
> Windows environment. Increasing the timeout solves the issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-1270) Hadoop C++ Extention

2016-04-13 Thread luoxu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

luoxu  updated MAPREDUCE-1270:
--
Affects Version/s: 2.6.2

> Hadoop C++ Extention
> 
>
> Key: MAPREDUCE-1270
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1270
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: task
>Affects Versions: 0.20.1
> Environment:  hadoop linux
>Reporter: Wang Shouyan
> Attachments: HADOOP-HCE-1.0.0.patch, HCE InstallMenu.pdf, HCE 
> Performance Report.pdf, HCE Tutorial.pdf, Overall Design of Hadoop C++ 
> Extension.doc
>
>
>   Hadoop C++ extension is an internal project in baidu, We start it for these 
> reasons:
>1  To provide C++ API. We mostly use Streaming before, and we also try to 
> use PIPES, but we do not find PIPES is more efficient than Streaming. So we 
> think a new C++ extention is needed for us.
>2  Even using PIPES or Streaming, it is hard to control memory of hadoop 
> map/reduce Child JVM.
>3  It costs so much to read/write/sort TB/PB data by Java. When using 
> PIPES or Streaming, pipe or socket is not efficient to carry so huge data.
>What we want to do: 
>1 We do not use map/reduce Child JVM to do any data processing, which just 
> prepares environment, starts C++ mapper, tells mapper which split it should  
> deal with, and reads report from mapper until that finished. The mapper will 
> read record, ivoke user defined map, to do partition, write spill, combine 
> and merge into file.out. We think these operations can be done by C++ code.
>2 Reducer is similar to mapper, it was started after sort finished, it 
> read from sorted files, ivoke user difined reduce, and write to user defined 
> record writer.
>3 We also intend to rewrite shuffle and sort with C++, for efficience and 
> memory control.
>at first, 1 and 2, then 3.  
>What's the difference with PIPES:
>1 Yes, We will reuse most PIPES code.
>2 And, We should do it more completely, nothing changed in scheduling and 
> management, but everything in execution.
> *UPDATE:*
> Now you can get a test version of HCE from this link 
> http://docs.google.com/leaf?id=0B5xhnqH1558YZjcxZmI0NzEtODczMy00NmZiLWFkNjAtZGM1MjZkMmNkNWFk&hl=zh_CN&pli=1
> This is a full package with all hadoop source code.
> Following document "HCE InstallMenu.pdf" in attachment, you will build and 
> deploy it in your cluster.
> Attachment "HCE Tutorial.pdf" will lead you to write the first HCE program 
> and give other specifications of the interface.
> Attachment "HCE Performance Report.pdf" gives a performance report of HCE 
> compared to Java MapRed and Pipes.
> Any comments are welcomed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-1270) Hadoop C++ Extention

2016-04-13 Thread luoxu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

luoxu  updated MAPREDUCE-1270:
--
Affects Version/s: (was: 2.6.2)

> Hadoop C++ Extention
> 
>
> Key: MAPREDUCE-1270
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1270
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: task
>Affects Versions: 0.20.1
> Environment:  hadoop linux
>Reporter: Wang Shouyan
> Attachments: HADOOP-HCE-1.0.0.patch, HCE InstallMenu.pdf, HCE 
> Performance Report.pdf, HCE Tutorial.pdf, Overall Design of Hadoop C++ 
> Extension.doc
>
>
>   Hadoop C++ extension is an internal project in baidu, We start it for these 
> reasons:
>1  To provide C++ API. We mostly use Streaming before, and we also try to 
> use PIPES, but we do not find PIPES is more efficient than Streaming. So we 
> think a new C++ extention is needed for us.
>2  Even using PIPES or Streaming, it is hard to control memory of hadoop 
> map/reduce Child JVM.
>3  It costs so much to read/write/sort TB/PB data by Java. When using 
> PIPES or Streaming, pipe or socket is not efficient to carry so huge data.
>What we want to do: 
>1 We do not use map/reduce Child JVM to do any data processing, which just 
> prepares environment, starts C++ mapper, tells mapper which split it should  
> deal with, and reads report from mapper until that finished. The mapper will 
> read record, ivoke user defined map, to do partition, write spill, combine 
> and merge into file.out. We think these operations can be done by C++ code.
>2 Reducer is similar to mapper, it was started after sort finished, it 
> read from sorted files, ivoke user difined reduce, and write to user defined 
> record writer.
>3 We also intend to rewrite shuffle and sort with C++, for efficience and 
> memory control.
>at first, 1 and 2, then 3.  
>What's the difference with PIPES:
>1 Yes, We will reuse most PIPES code.
>2 And, We should do it more completely, nothing changed in scheduling and 
> management, but everything in execution.
> *UPDATE:*
> Now you can get a test version of HCE from this link 
> http://docs.google.com/leaf?id=0B5xhnqH1558YZjcxZmI0NzEtODczMy00NmZiLWFkNjAtZGM1MjZkMmNkNWFk&hl=zh_CN&pli=1
> This is a full package with all hadoop source code.
> Following document "HCE InstallMenu.pdf" in attachment, you will build and 
> deploy it in your cluster.
> Attachment "HCE Tutorial.pdf" will lead you to write the first HCE program 
> and give other specifications of the interface.
> Attachment "HCE Performance Report.pdf" gives a performance report of HCE 
> compared to Java MapRed and Pipes.
> Any comments are welcomed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6607) .staging dir is not cleaned up if mapreduce.task.files.preserve.failedtask or mapreduce.task.files.preserve.filepattern are set

2016-04-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238955#comment-15238955
 ] 

Hadoop QA commented on MAPREDUCE-6607:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
9s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
46s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 16s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 14s 
{color} | {color:red} 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app: 
patch generated 1 new + 71 unchanged - 0 fixed = 72 total (was 71) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
57s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 40s 
{color} | {color:green} hadoop-mapreduce-client-app in the patch passed with 
JDK v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 55s 
{color} | {color:green} hadoop-mapreduce-client-app in the patch passed with 
JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 34m 23s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12798473/MAPREDUCE-6607.05.patch
 |
| JIRA Issue | MAPREDUCE-6607 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 2e40e7adc04d 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptc

[jira] [Commented] (MAPREDUCE-6607) .staging dir is not cleaned up if mapreduce.task.files.preserve.failedtask or mapreduce.task.files.preserve.filepattern are set

2016-04-13 Thread Kai Sasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238885#comment-15238885
 ] 

Kai Sasaki commented on MAPREDUCE-6607:
---

[~ajisakaa] Thank you so much for taking care. I updated the patch. Could you 
check it when you get a chance?

> .staging dir is not cleaned up if mapreduce.task.files.preserve.failedtask or 
> mapreduce.task.files.preserve.filepattern are set
> ---
>
> Key: MAPREDUCE-6607
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6607
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Affects Versions: 2.7.1
>Reporter: Maysam Yabandeh
>Assignee: Kai Sasaki
>Priority: Minor
> Attachments: MAPREDUCE-6607.01.patch, MAPREDUCE-6607.02.patch, 
> MAPREDUCE-6607.03.patch, MAPREDUCE-6607.04.patch, MAPREDUCE-6607.05.patch
>
>
> if either of the following configs are set, then .staging dir is not cleaned 
> up:
> * mapreduce.task.files.preserve.failedtask 
> * mapreduce.task.files.preserve.filepattern
> The former was supposed to keep only .staging of failed tasks and the latter 
> was supposed to be used only if that task name matches against the specified 
> regular expression.
> {code}
>   protected boolean keepJobFiles(JobConf conf) {
> return (conf.getKeepTaskFilesPattern() != null || conf
> .getKeepFailedTaskFiles());
>   }
> {code}
> {code}
>   public void cleanupStagingDir() throws IOException {
> /* make sure we clean the staging files */
> String jobTempDir = null;
> FileSystem fs = getFileSystem(getConfig());
> try {
>   if (!keepJobFiles(new JobConf(getConfig( {
> jobTempDir = getConfig().get(MRJobConfig.MAPREDUCE_JOB_DIR);
> if (jobTempDir == null) {
>   LOG.warn("Job Staging directory is null");
>   return;
> }
> Path jobTempDirPath = new Path(jobTempDir);
> LOG.info("Deleting staging directory " + 
> FileSystem.getDefaultUri(getConfig()) +
> " " + jobTempDir);
> fs.delete(jobTempDirPath, true);
>   }
> } catch(IOException io) {
>   LOG.error("Failed to cleanup staging dir " + jobTempDir, io);
> }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6607) .staging dir is not cleaned up if mapreduce.task.files.preserve.failedtask or mapreduce.task.files.preserve.filepattern are set

2016-04-13 Thread Kai Sasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Sasaki updated MAPREDUCE-6607:
--
Attachment: MAPREDUCE-6607.05.patch

> .staging dir is not cleaned up if mapreduce.task.files.preserve.failedtask or 
> mapreduce.task.files.preserve.filepattern are set
> ---
>
> Key: MAPREDUCE-6607
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6607
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Affects Versions: 2.7.1
>Reporter: Maysam Yabandeh
>Assignee: Kai Sasaki
>Priority: Minor
> Attachments: MAPREDUCE-6607.01.patch, MAPREDUCE-6607.02.patch, 
> MAPREDUCE-6607.03.patch, MAPREDUCE-6607.04.patch, MAPREDUCE-6607.05.patch
>
>
> if either of the following configs are set, then .staging dir is not cleaned 
> up:
> * mapreduce.task.files.preserve.failedtask 
> * mapreduce.task.files.preserve.filepattern
> The former was supposed to keep only .staging of failed tasks and the latter 
> was supposed to be used only if that task name matches against the specified 
> regular expression.
> {code}
>   protected boolean keepJobFiles(JobConf conf) {
> return (conf.getKeepTaskFilesPattern() != null || conf
> .getKeepFailedTaskFiles());
>   }
> {code}
> {code}
>   public void cleanupStagingDir() throws IOException {
> /* make sure we clean the staging files */
> String jobTempDir = null;
> FileSystem fs = getFileSystem(getConfig());
> try {
>   if (!keepJobFiles(new JobConf(getConfig( {
> jobTempDir = getConfig().get(MRJobConfig.MAPREDUCE_JOB_DIR);
> if (jobTempDir == null) {
>   LOG.warn("Job Staging directory is null");
>   return;
> }
> Path jobTempDirPath = new Path(jobTempDir);
> LOG.info("Deleting staging directory " + 
> FileSystem.getDefaultUri(getConfig()) +
> " " + jobTempDir);
> fs.delete(jobTempDirPath, true);
>   }
> } catch(IOException io) {
>   LOG.error("Failed to cleanup staging dir " + jobTempDir, io);
> }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6672) TestTeraSort fails on Windows

2016-04-13 Thread Tibor Kiss (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238881#comment-15238881
 ] 

Tibor Kiss commented on MAPREDUCE-6672:
---

Thanks [~djp] for the proposed enhancement. I agree that it is not nice to make 
the testcase OS specific.

Unfortunately the proposed fix does not resolve the issue, the same error 
appears:
{noformat}
  TestTeraSort.testTeraSort:92->runTeraSort:67 ╗ IO No FileSystem for scheme: C
{noformat}

If we don't want to make distinction in the testcase based on the OS then we 
need to extend the URI/Scheme
validator to expect windows drive letters in front of the path.

Please let me know your preferred solution.

Thanks!

> TestTeraSort fails on Windows
> -
>
> Key: MAPREDUCE-6672
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6672
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0, 2.8.0
> Environment: OS: Windows Server 2012
> JDK: Oracle 1.7.0_79
>Reporter: Tibor Kiss
>Assignee: Tibor Kiss
>Priority: Minor
> Attachments: MAPREDUCE-6672.01.patch
>
>
> TestTeraSort testcase fails on Windows.
> The test case uses the build directory as test working directory.
> Under Windows the build directory starts with a drive definition ( "C:" ),
> which is interpreted as (an invalid) URI scheme. 
> The fix is trivial: Add URI scheme to the beginning of the working directory.
> Error message:
> {noformat}
> Running org.apache.hadoop.examples.terasort.TestTeraSort
> Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 3.647 sec <<< 
> FAILURE! - in org.apache.hadoop.examples.terasort.TestTeraSort
> testTeraSort(org.apache.hadoop.examples.terasort.TestTeraSort)  Time elapsed: 
> 3.359 sec  <<< ERROR!
> java.io.IOException: No FileSystem for scheme: C
> at 
> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2787)
> at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2798)
> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
> at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2837)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2819)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:381)
> at 
> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:223)
> at 
> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:93)
> at 
> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
> at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadFiles(JobResourceUploader.java:179)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:98)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:193)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1338)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1742)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1338)
> at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1359)
> at org.apache.hadoop.examples.terasort.TeraSort.run(TeraSort.java:331)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at 
> org.apache.hadoop.examples.terasort.TestTeraSort.runTeraSort(TestTeraSort.java:75)
> at 
> org.apache.hadoop.examples.terasort.TestTeraSort.testTeraSort(TestTeraSort.java:101)
> Results :
> Tests in error:
>   TestTeraSort.testTeraSort:101->runTeraSort:75 ╗ IO No FileSystem for 
> scheme: C
> Tests run: 2, Failures: 0, Errors: 1, Skipped: 0
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)