[jira] [Commented] (MAPREDUCE-5161) CombineFileInputFormat fix for paths not on default FS merge from branch-1 to branch-1-win

2013-04-18 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13635634#comment-13635634
 ] 

Bikas Saha commented on MAPREDUCE-5161:
---

The patch looks like a clean merge of MAPREDUCE-1806. It's not clear whether it 
reverts the independent fix in branch-1-win that is mentioned in the 
description.

 CombineFileInputFormat fix for paths not on default FS merge from branch-1 to 
 branch-1-win
 --

 Key: MAPREDUCE-5161
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5161
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Affects Versions: 1-win
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: MAPREDUCE-5161-branch-1-win.1.patch


 MAPREDUCE-1806 fixed a bug related to use of {{CombineFileInputFormat}} with 
 paths that are not on the default file system.  This same bug was fixed 
 independently on branch-1-win.  The code was slightly different, but 
 equivalent to the branch-1 fix.  This jira will apply the branch-1 fix to 
 branch-1-win to keep the 2 code lines in agreement.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5161) CombineFileInputFormat fix for paths not on default FS merge from branch-1 to branch-1-win

2013-04-18 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13635668#comment-13635668
 ] 

Chris Nauroth commented on MAPREDUCE-5161:
--

{quote}
It's not clear whether it reverts the independent fix in branch-1-win that is 
mentioned in the description.
{quote}

Yes, this patch reverts that fix so that branch-1 and branch-1-win are 
identical for this logic.  For reference, I've included a diff below showing 
the earlier fix that was made straight to branch-1-win, so you can compare.  
The branch-1 version is preferable and includes more tests.

{code}
diff --git src/mapred/org/apache/hadoop/mapred/lib/CombineFileInputFormat.java src/mapred/org/apache/hadoop/mapred/lib/CombineFileInputFormat.java
index c55df11..c439bad 100644
--- src/mapred/org/apache/hadoop/mapred/lib/CombineFileInputFormat.java
+++ src/mapred/org/apache/hadoop/mapred/lib/CombineFileInputFormat.java
@@ -194,7 +194,7 @@ public abstract class CombineFileInputFormat<K, V>
   continue;
 }
 FileSystem fs = paths[i].getFileSystem(job);
-Path p = new Path(paths[i].toUri().getPath());
+Path p = new Path(paths[i].toString());
 if (onepool.accept(p)) {
   myPaths.add(paths[i]); // add it to my output set
   paths[i] = null;   // already processed
diff --git src/mapred/org/apache/hadoop/mapreduce/lib/input/CombineFileInputFormat.java src/mapred/org/apache/hadoop/mapreduce/lib/input/CombineFileInputFormat.java
index c9fa549..c7929e4 100644
--- src/mapred/org/apache/hadoop/mapreduce/lib/input/CombineFileInputFormat.java
+++ src/mapred/org/apache/hadoop/mapreduce/lib/input/CombineFileInputFormat.java
@@ -211,7 +211,7 @@ public abstract class CombineFileInputFormat<K, V>
 // times, one time each for each pool in the next loop.
 List<Path> newpaths = new LinkedList<Path>();
 for (int i = 0; i < paths.length; i++) {
-  Path p = new Path(paths[i].toUri().getPath());
+  Path p = new Path(paths[i].toString());
   newpaths.add(p);
 }
 paths = null;
diff --git src/test/org/apache/hadoop/mapred/lib/TestCombineFileInputFormat.java src/test/org/apache/hadoop/mapred/lib/TestCombineFileInputFormat.java
index 8f7c4be..f013bb8 100644
--- src/test/org/apache/hadoop/mapred/lib/TestCombineFileInputFormat.java
+++ src/test/org/apache/hadoop/mapred/lib/TestCombineFileInputFormat.java
@@ -462,7 +462,8 @@ public class TestCombineFileInputFormat extends TestCase{
 // returns true if the specified path matches the prefix stored
 // in this TestFilter.
 public boolean accept(Path path) {
-  if (path.toString().indexOf(p.toString()) == 0) {
+  Path uriPath = new Path(path.toUri().getPath());
+  if (uriPath.toString().indexOf(p.toString()) == 0) {
 return true;
   }
   return false;
diff --git src/test/org/apache/hadoop/mapreduce/lib/input/TestCombineFileInputFormat.java src/test/org/apache/hadoop/mapreduce/lib/input/TestCombineFileInputFormat.java
index c80c70d..16345bd 100644
--- src/test/org/apache/hadoop/mapreduce/lib/input/TestCombineFileInputFormat.java
+++ src/test/org/apache/hadoop/mapreduce/lib/input/TestCombineFileInputFormat.java
@@ -1122,7 +1122,8 @@ public class TestCombineFileInputFormat extends TestCase {
 // returns true if the specified path matches the prefix stored
 // in this TestFilter.
 public boolean accept(Path path) {
-  if (path.toString().indexOf(p.toString()) == 0) {
+  Path uriPath = new Path(path.toUri().getPath());
+  if (uriPath.toString().indexOf(p.toString()) == 0) {
 return true;
   }
   return false;
{code}
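
As a rough illustration of the difference between the two variants in the diff above, the following sketch uses java.net.URI as a stand-in for Hadoop's Path (an assumption made only to keep the example self-contained; the host names and file names are hypothetical). It suggests why keeping only the path component can make files on different file systems indistinguishable, while the string form keeps the scheme and authority.

```java
import java.net.URI;

// Illustration only: java.net.URI stands in for org.apache.hadoop.fs.Path.
class SchemeStrippingDemo {
    // Analogous to paths[i].toUri().getPath(): keeps only the path component.
    static String pathOnly(String location) {
        return URI.create(location).getPath();
    }

    // Analogous to paths[i].toString(): keeps scheme and authority as well.
    static String full(String location) {
        return URI.create(location).toString();
    }

    public static void main(String[] args) {
        // Hypothetical locations on two different name nodes.
        String a = "hdfs://nn1:8020/data/part-0";
        String b = "hdfs://nn2:8020/data/part-0";

        // The path-only form cannot tell the two file systems apart.
        System.out.println(pathOnly(a).equals(pathOnly(b))); // prints "true"

        // The full form still distinguishes them.
        System.out.println(full(a).equals(full(b)));         // prints "false"
    }
}
```

The same reasoning explains the test change in the diff: a filter that compares against a bare path prefix has to strip the scheme and authority itself once it is handed full path strings.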




[jira] [Commented] (MAPREDUCE-5161) CombineFileInputFormat fix for paths not on default FS merge from branch-1 to branch-1-win

2013-04-18 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13635841#comment-13635841
 ] 

Bikas Saha commented on MAPREDUCE-5161:
---

Then I don't understand why some of these diffs show up. For example, from the 
diff in your comment above:
{code}
 FileSystem fs = paths[i].getFileSystem(job);
-Path p = new Path(paths[i].toUri().getPath());
+Path p = new Path(paths[i].toString());
 if (onepool.accept(p)) {
{code}

From the attached patch:
{code}
 FileSystem fs = paths[i].getFileSystem(job);
-Path p = new Path(paths[i].toUri().getPath());
+Path p = fs.makeQualified(paths[i]);
 if (onepool.accept(p)) {
{code}

Shouldn't I see this instead?
{code}
 FileSystem fs = paths[i].getFileSystem(job);
-Path p = new Path(paths[i].toString());
+Path p = fs.makeQualified(paths[i]);
 if (onepool.accept(p)) {
{code}
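
To make the third variant in the snippets above more concrete, here is a minimal sketch of what qualifying a path against its file system amounts to. It only approximates the idea behind fs.makeQualified and is not Hadoop's implementation: java.net.URI stands in for Path, the helper name qualify is hypothetical, and working-directory handling for relative paths is deliberately left out.

```java
import java.net.URI;

// Illustration only: approximates the idea of FileSystem.makeQualified
// using java.net.URI. The helper name and inputs are hypothetical.
class QualifyDemo {
    // Returns the path with the file system's scheme and authority attached,
    // unless the path already carries a scheme of its own.
    static String qualify(String fsUri, String path) {
        URI p = URI.create(path);
        if (p.getScheme() != null) {
            return p.toString(); // already qualified, leave untouched
        }
        // Resolving an absolute path against the file system URI keeps the
        // base scheme and authority and takes the path from the argument.
        return URI.create(fsUri).resolve(p).toString();
    }

    public static void main(String[] args) {
        // Prints "hdfs://nn1:8020/data/part-0"
        System.out.println(qualify("hdfs://nn1:8020/", "/data/part-0"));
        // Prints "file:///tmp/part-0"
        System.out.println(qualify("hdfs://nn1:8020/", "file:///tmp/part-0"));
    }
}
```

Under these assumptions, a qualified path always names its file system explicitly, whether or not the caller originally supplied a scheme.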




[jira] [Commented] (MAPREDUCE-5161) CombineFileInputFormat fix for paths not on default FS merge from branch-1 to branch-1-win

2013-04-18 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13635868#comment-13635868
 ] 

Chris Nauroth commented on MAPREDUCE-5161:
--

Sorry for the confusion.  It turns out that the diff in my comment above came 
from a personal branch.  The independent fix I described never actually got 
committed to branch-1-win, so really, this is just a simple merge of the 
branch-1 fix to branch-1-win.

I've updated the ticket description to state that this is a merge (and not a 
revert of a prior independent fix).

Thanks!


 CombineFileInputFormat fix for paths not on default FS merge from branch-1 to 
 branch-1-win
 --

 Key: MAPREDUCE-5161
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5161
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Affects Versions: 1-win
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: MAPREDUCE-5161-branch-1-win.1.patch


 MAPREDUCE-1806 fixed a bug related to use of {{CombineFileInputFormat}} with 
 paths that are not on the default file system.  This jira will merge the 
 branch-1 fix to branch-1-win.
