[jira] [Commented] (MAPREDUCE-5161) CombineFileInputFormat fix for paths not on default FS merge from branch-1 to branch-1-win
[ https://issues.apache.org/jira/browse/MAPREDUCE-5161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13635634#comment-13635634 ]

Bikas Saha commented on MAPREDUCE-5161:
---

The patch looks like a clean merge of MAPREDUCE-1806. It's not clear whether it reverts the independent fix in branch-1-win that is mentioned in the description.

CombineFileInputFormat fix for paths not on default FS merge from branch-1 to branch-1-win
--
Key: MAPREDUCE-5161
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5161
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv1
Affects Versions: 1-win
Reporter: Chris Nauroth
Assignee: Chris Nauroth
Attachments: MAPREDUCE-5161-branch-1-win.1.patch

MAPREDUCE-1806 fixed a bug related to use of {{CombineFileInputFormat}} with paths that are not on the default file system. This same bug was fixed independently on branch-1-win. The code was slightly different, but equivalent to the branch-1 fix. This jira will apply the branch-1 fix to branch-1-win to keep the 2 code lines in agreement.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5161) CombineFileInputFormat fix for paths not on default FS merge from branch-1 to branch-1-win
[ https://issues.apache.org/jira/browse/MAPREDUCE-5161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13635668#comment-13635668 ]

Chris Nauroth commented on MAPREDUCE-5161:
--

{quote}
It's not clear whether it reverts the independent fix in branch-1-win that is mentioned in the description?
{quote}

Yes, this patch reverts that fix so that branch-1 and branch-1-win are identical for this logic. For reference, I've included a diff below showing the earlier fix that was made straight to branch-1-win, so you can compare. The branch-1 version is preferable and includes more tests.

{code}
diff --git src/mapred/org/apache/hadoop/mapred/lib/CombineFileInputFormat.java src/mapred/org/apache/hadoop/mapred/lib/CombineFileInputFormat.java
index c55df11..c439bad 100644
--- src/mapred/org/apache/hadoop/mapred/lib/CombineFileInputFormat.java
+++ src/mapred/org/apache/hadoop/mapred/lib/CombineFileInputFormat.java
@@ -194,7 +194,7 @@ public abstract class CombineFileInputFormat<K, V>
         continue;
       }
       FileSystem fs = paths[i].getFileSystem(job);
-      Path p = new Path(paths[i].toUri().getPath());
+      Path p = new Path(paths[i].toString());
       if (onepool.accept(p)) {
         myPaths.add(paths[i]); // add it to my output set
         paths[i] = null;       // already processed
diff --git src/mapred/org/apache/hadoop/mapreduce/lib/input/CombineFileInputFormat.java src/mapred/org/apache/hadoop/mapreduce/lib/input/CombineFileInputFormat.java
index c9fa549..c7929e4 100644
--- src/mapred/org/apache/hadoop/mapreduce/lib/input/CombineFileInputFormat.java
+++ src/mapred/org/apache/hadoop/mapreduce/lib/input/CombineFileInputFormat.java
@@ -211,7 +211,7 @@ public abstract class CombineFileInputFormat<K, V>
     // times, one time each for each pool in the next loop.
     List<Path> newpaths = new LinkedList<Path>();
     for (int i = 0; i < paths.length; i++) {
-      Path p = new Path(paths[i].toUri().getPath());
+      Path p = new Path(paths[i].toString());
       newpaths.add(p);
     }
     paths = null;
diff --git src/test/org/apache/hadoop/mapred/lib/TestCombineFileInputFormat.java src/test/org/apache/hadoop/mapred/lib/TestCombineFileInputFormat.java
index 8f7c4be..f013bb8 100644
--- src/test/org/apache/hadoop/mapred/lib/TestCombineFileInputFormat.java
+++ src/test/org/apache/hadoop/mapred/lib/TestCombineFileInputFormat.java
@@ -462,7 +462,8 @@ public class TestCombineFileInputFormat extends TestCase{
     // returns true if the specified path matches the prefix stored
     // in this TestFilter.
     public boolean accept(Path path) {
-      if (path.toString().indexOf(p.toString()) == 0) {
+      Path uriPath = new Path(path.toUri().getPath());
+      if (uriPath.toString().indexOf(p.toString()) == 0) {
         return true;
       }
       return false;
diff --git src/test/org/apache/hadoop/mapreduce/lib/input/TestCombineFileInputFormat.java src/test/org/apache/hadoop/mapreduce/lib/input/TestCombineFileInputFormat.java
index c80c70d..16345bd 100644
--- src/test/org/apache/hadoop/mapreduce/lib/input/TestCombineFileInputFormat.java
+++ src/test/org/apache/hadoop/mapreduce/lib/input/TestCombineFileInputFormat.java
@@ -1122,7 +1122,8 @@ public class TestCombineFileInputFormat extends TestCase {
     // returns true if the specified path matches the prefix stored
     // in this TestFilter.
     public boolean accept(Path path) {
-      if (path.toString().indexOf(p.toString()) == 0) {
+      Path uriPath = new Path(path.toUri().getPath());
+      if (uriPath.toString().indexOf(p.toString()) == 0) {
         return true;
       }
       return false;
{code}
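The difference between the two path forms in the diff above can be sketched with plain java.net.URI as a stand-in for Hadoop's Path (an assumption made so the example runs without Hadoop on the classpath). The class name, helper names, and the hosts nn1 and nn2 are hypothetical. It shows that keeping only the path component drops the scheme and authority, so the same path on two different file systems becomes indistinguishable.

```java
import java.net.URI;

public class PathFormsSketch {
    // Stand-in for paths[i].toUri().getPath(): keeps only the path component.
    static String pathOnly(String location) {
        return URI.create(location).getPath();
    }

    // Stand-in for paths[i].toString(): keeps scheme and authority as well.
    static String full(String location) {
        return URI.create(location).toString();
    }

    public static void main(String[] args) {
        // Hypothetical locations on two different namenodes.
        String a = "hdfs://nn1:8020/data/part-0";
        String b = "hdfs://nn2:8020/data/part-0";

        // The path-only form cannot tell the two file systems apart.
        System.out.println(pathOnly(a).equals(pathOnly(b)));
        // The full form can.
        System.out.println(full(a).equals(full(b)));
    }
}
```

A pool filter that receives only the path-only form has no way to match on the file system, which is presumably why both fixes discussed here pass a form that retains it.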
[jira] [Commented] (MAPREDUCE-5161) CombineFileInputFormat fix for paths not on default FS merge from branch-1 to branch-1-win
[ https://issues.apache.org/jira/browse/MAPREDUCE-5161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13635841#comment-13635841 ]

Bikas Saha commented on MAPREDUCE-5161:
---

Then I don't understand why some of the diffs show up. For example, from the diff in your comment above:

{code}
       FileSystem fs = paths[i].getFileSystem(job);
-      Path p = new Path(paths[i].toUri().getPath());
+      Path p = new Path(paths[i].toString());
       if (onepool.accept(p)) {
{code}

From the attached patch:

{code}
       FileSystem fs = paths[i].getFileSystem(job);
-      Path p = new Path(paths[i].toUri().getPath());
+      Path p = fs.makeQualified(paths[i]);
       if (onepool.accept(p)) {
{code}

Shouldn't I see this instead?

{code}
       FileSystem fs = paths[i].getFileSystem(job);
-      Path p = new Path(paths[i].toString());
+      Path p = fs.makeQualified(paths[i]);
       if (onepool.accept(p)) {
{code}
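As a rough illustration of what qualifying a path means in the snippets above, the sketch below uses java.net.URI resolution as a stand-in for fs.makeQualified(paths[i]). This is an assumption for illustration only: Hadoop's real method also handles relative paths and working directories, which are not modeled here. The class name, the helper qualify, and the host nn1 are hypothetical.

```java
import java.net.URI;

public class QualifySketch {
    // Stand-in for fs.makeQualified(path): fill in the scheme and authority
    // from the file system URI when the location does not already have them.
    static String qualify(String fsUri, String location) {
        return URI.create(fsUri).resolve(URI.create(location)).toString();
    }

    public static void main(String[] args) {
        String fs = "hdfs://nn1:8020/"; // hypothetical file system URI

        // A bare path picks up the file system's scheme and authority.
        System.out.println(qualify(fs, "/data/part-0"));
        // A location that already names its file system is left unchanged.
        System.out.println(qualify(fs, "file:///tmp/part-0"));
    }
}
```

Unlike the toString() form, a qualified path always names its file system, even when the input was a bare path, so a filter comparing against it sees a consistent form.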
[jira] [Commented] (MAPREDUCE-5161) CombineFileInputFormat fix for paths not on default FS merge from branch-1 to branch-1-win
[ https://issues.apache.org/jira/browse/MAPREDUCE-5161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13635868#comment-13635868 ]

Chris Nauroth commented on MAPREDUCE-5161:
--

Sorry for the confusion. It turns out that the diff in my comment above came from a personal branch. The independent fix I described never actually got committed to branch-1-win, so really, this is just a simple merge of the branch-1 fix to branch-1-win. I've updated the ticket description to state that this is a merge (and not a revert of a prior independent fix). Thanks!

CombineFileInputFormat fix for paths not on default FS merge from branch-1 to branch-1-win
--
Key: MAPREDUCE-5161
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5161
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv1
Affects Versions: 1-win
Reporter: Chris Nauroth
Assignee: Chris Nauroth
Attachments: MAPREDUCE-5161-branch-1-win.1.patch

MAPREDUCE-1806 fixed a bug related to use of {{CombineFileInputFormat}} with paths that are not on the default file system. This jira will merge the branch-1 fix to branch-1-win.