[jira] Commented: (MAPREDUCE-1501) FileInputFormat to support multi-level/recursive directory listing

2010-03-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841428#action_12841428
 ] 

Hudson commented on MAPREDUCE-1501:
---

Integrated in Hadoop-Mapreduce-trunk #248 (see 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/248/]).
FileInputFormat supports multi-level, recursive directory listing. 
(Zheng Shao via dhruba)


 FileInputFormat to support multi-level/recursive directory listing
 --

 Key: MAPREDUCE-1501
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1501
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Zheng Shao
Assignee: Zheng Shao
 Attachments: MAPREDUCE-1501.1.branch-0.20.patch, 
 MAPREDUCE-1501.1.trunk.patch


 As we have seen multiple times on the mailing list, users want to be able to 
 get all files out of a multi-level directory structure.
 4/1/2008: 
 http://mail-archives.apache.org/mod_mbox/hadoop-core-user/200804.mbox/%3ce75c02ef0804011433x144813e6x2450da7883de3...@mail.gmail.com%3e
 2/3/2009: 
 http://mail-archives.apache.org/mod_mbox/hadoop-core-user/200902.mbox/%3c7f80089c-3e7f-4330-90ba-6f1c5b0b0...@nist.gov%3e
 6/2/2009: 
 http://mail-archives.apache.org/mod_mbox/hadoop-common-user/200906.mbox/%3c4a258a16.8050...@darose.net%3e
 One solution our users have used is to write a new FileInputFormat, but that 
 means all existing FileInputFormat subclasses need to be changed in order to 
 support this feature.
 We can easily provide a JobConf option (defaulting to false) that makes 
 {{FileInputFormat.listStatus(...)}} descend recursively into the directory 
 structure.
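
The proposal above can be illustrated with a minimal sketch. It deliberately uses only java.nio.file rather than the Hadoop FileSystem API, so the class name, method names, and the boolean parameter are all hypothetical stand-ins for the JobConf option, not the patch's actual code. In this sketch subdirectories are simply skipped when the flag is false; that choice is an assumption made for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not the MAPREDUCE-1501 patch and not a Hadoop API.
public class RecursiveListingSketch {

    /**
     * Lists regular files under dir. The recursive flag plays the role of the
     * proposed option: when false (the default in the proposal), only files
     * directly inside dir are returned.
     */
    public static List<Path> listInputFiles(Path dir, boolean recursive) throws IOException {
        List<Path> result = new ArrayList<>();
        collect(dir, recursive, result);
        return result;
    }

    private static void collect(Path dir, boolean recursive, List<Path> out) throws IOException {
        try (DirectoryStream<Path> children = Files.newDirectoryStream(dir)) {
            for (Path child : children) {
                if (Files.isDirectory(child)) {
                    // Descend only when the option is enabled.
                    // Note: symbolic-link cycles are not guarded against here.
                    if (recursive) {
                        collect(child, true, out);
                    }
                } else {
                    out.add(child);
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("input");
        Files.createDirectories(root.resolve("2010/03"));
        Files.createFile(root.resolve("top.txt"));
        Files.createFile(root.resolve("2010/03/part-00000"));
        System.out.println(listInputFiles(root, false).size()); // 1
        System.out.println(listInputFiles(root, true).size());  // 2
    }
}
```

Because the flag is consulted inside the shared listing routine, callers that never set it see the same flat listing as before, which is the reason the proposal puts the option in the base class instead of a new subclass.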

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1501) FileInputFormat to support multi-level/recursive directory listing

2010-03-04 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841658#action_12841658
 ] 

Chris Douglas commented on MAPREDUCE-1501:
--

{noformat}
+import com.sun.org.apache.commons.logging.Log;
+import com.sun.org.apache.commons.logging.LogFactory;
{noformat}
Should these imports be {{org.apache.commons.logging}}, not 
{{com.sun...}}?

Is there a reason this feature was only added to a deprecated class, instead of 
the {{FileInputFormat}} in the {{mapreduce}} package?




[jira] Commented: (MAPREDUCE-1501) FileInputFormat to support multi-level/recursive directory listing

2010-03-03 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840541#action_12840541
 ] 

dhruba borthakur commented on MAPREDUCE-1501:
-

The failed unit test is TestMiniMRLocalFS.testWithLocal, which is not related 
to this patch. I will commit this patch.




[jira] Commented: (MAPREDUCE-1501) FileInputFormat to support multi-level/recursive directory listing

2010-03-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841042#action_12841042
 ] 

Hudson commented on MAPREDUCE-1501:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #257 (see 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/257/]).
FileInputFormat supports multi-level, recursive directory listing. 
(Zheng Shao via dhruba)





[jira] Commented: (MAPREDUCE-1501) FileInputFormat to support multi-level/recursive directory listing

2010-03-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840122#action_12840122
 ] 

Hadoop QA commented on MAPREDUCE-1501:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12436481/MAPREDUCE-1501.1.trunk.patch
  against trunk revision 916823.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 4 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/339/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/339/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/339/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/339/console

This message is automatically generated.




[jira] Commented: (MAPREDUCE-1501) FileInputFormat to support multi-level/recursive directory listing

2010-02-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836602#action_12836602
 ] 

Hadoop QA commented on MAPREDUCE-1501:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12436481/MAPREDUCE-1501.1.trunk.patch
  against trunk revision 912471.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 4 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/469/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/469/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/469/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/469/console





[jira] Commented: (MAPREDUCE-1501) FileInputFormat to support multi-level/recursive directory listing

2010-02-22 Thread Zheng Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836936#action_12836936
 ] 

Zheng Shao commented on MAPREDUCE-1501:
---

Thanks for the feedback, Ian.
I don't think FileSystem.listPath() returns "." or "..". If it does, I believe 
the current code in trunk will also break. The new unit test will also fail if 
that's the case.





[jira] Commented: (MAPREDUCE-1501) FileInputFormat to support multi-level/recursive directory listing

2010-02-22 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836948#action_12836948
 ] 

dhruba borthakur commented on MAPREDUCE-1501:
-

I think Ian mentioned that you can enhance this feature by allowing the user to 
register a set of PathFilters. That will allow the job to process only a 
selected subset of the subdirectories.




[jira] Commented: (MAPREDUCE-1501) FileInputFormat to support multi-level/recursive directory listing

2010-02-22 Thread Zheng Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836963#action_12836963
 ] 

Zheng Shao commented on MAPREDUCE-1501:
---

Thanks, Dhruba. I missed the part and other hidden directories. We do call 
PathFilter on the subdirectories as well (see addInputPathRecursively(...)). 
Is that good enough, or do we want to split the PathFilters for files from the 
PathFilters for directories?
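
To make the question concrete, here is a small sketch of a single filter applied to both files and subdirectories during recursion. It uses java.nio.file and java.util.function.Predicate in place of Hadoop's PathFilter, and every name in it (FilteredRecursionSketch, HIDDEN_FILTER, listFiltered, addRecursively) is hypothetical. The rule of rejecting names that begin with "_" or "." is an assumption chosen for illustration, not a statement about what the patch does.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration only: Predicate<Path> stands in for a PathFilter.
public class FilteredRecursionSketch {

    /** Assumed example rule: reject entries whose name starts with "_" or ".". */
    public static final Predicate<Path> HIDDEN_FILTER = p -> {
        String name = p.getFileName().toString();
        return !name.startsWith("_") && !name.startsWith(".");
    };

    /** Returns all regular files under dir that pass the filter at every level. */
    public static List<Path> listFiltered(Path dir, Predicate<Path> filter) throws IOException {
        List<Path> out = new ArrayList<>();
        addRecursively(dir, filter, out);
        return out;
    }

    private static void addRecursively(Path dir, Predicate<Path> filter, List<Path> out)
            throws IOException {
        try (DirectoryStream<Path> children = Files.newDirectoryStream(dir)) {
            for (Path child : children) {
                // The same filter is applied to files and directories alike,
                // so a rejected directory prunes its entire subtree.
                if (!filter.test(child)) {
                    continue;
                }
                if (Files.isDirectory(child)) {
                    addRecursively(child, filter, out);
                } else {
                    out.add(child);
                }
            }
        }
    }
}
```

With one shared filter, a rule written for file names also decides which subdirectories are entered. Splitting it into a file filter and a directory filter would only matter for jobs that need different rules at the two levels.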





[jira] Commented: (MAPREDUCE-1501) FileInputFormat to support multi-level/recursive directory listing

2010-02-22 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12837030#action_12837030
 ] 

dhruba borthakur commented on MAPREDUCE-1501:
-

That should be good enough, unless Ian has some other ideas.
