[
https://issues.apache.org/jira/browse/HADOOP-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893760#action_12893760
]
Suresh Srinivas commented on HADOOP-6870:
-----------------------------------------
Sorry for posting the comments late. I was busy.
# General comment: I have concerns about recursive listing. Applications could
abuse it and generate a large number of requests to HDFS.
# Any deletion of files/directories while recursing through directories results
in a RuntimeException and leaves the application with a partial result. Should
we ignore a directory that was in {{stack}} but is not found later when
iterating through it?
# FileSystem.java
#* listFile() - the method javadoc could be better organized: first describe the
case where the path is a directory, covering recursive=true and recursive=false;
then the case where the path is a file, again covering both values of recursive.
#* listFile() - document that it throws RuntimeException and
UnsupportedOperationException, along with the possible causes. IOException is
no longer thrown.
# TestListFiles.java
#* testDirectory() - the comments {{test empty directory}} and {{test directory
with 1 file}} should be moved up to the relevant sections of the test.
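
To make the question about vanished directories concrete, here is a minimal sketch of the "ignore it" behaviour. It is not the code in the attached patches: it uses java.nio.file on a local filesystem as a stand-in for HDFS, and the class name TolerantLister and its listFiles method are hypothetical. The traversal keeps pending directories on a stack and skips any queued subdirectory that no longer exists when it is opened, while still failing if the caller's own path is missing.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.NotDirectoryException;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper for illustration only; not part of the patch under review.
final class TolerantLister {

    // Returns the files under root. If root is a file, returns just that file.
    // If recursive is false, only the direct children of root are listed.
    static List<Path> listFiles(Path root, boolean recursive) throws IOException {
        List<Path> files = new ArrayList<>();
        if (Files.isRegularFile(root)) {
            files.add(root);
            return files;
        }

        // Directories still to be visited, as in the stack-based traversal
        // discussed in the review comment.
        Deque<Path> stack = new ArrayDeque<>();
        stack.push(root);

        while (!stack.isEmpty()) {
            Path dir = stack.pop();
            try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
                for (Path entry : entries) {
                    if (Files.isDirectory(entry)) {
                        if (recursive) {
                            stack.push(entry);
                        }
                    } else {
                        files.add(entry);
                    }
                }
            } catch (NoSuchFileException | NotDirectoryException e) {
                // The path the caller asked for must exist, so report that.
                if (dir.equals(root)) {
                    throw e;
                }
                // Otherwise a queued subdirectory was deleted or replaced after
                // it was pushed; skip it and keep the results gathered so far.
            }
        }
        return files;
    }
}
```

With this approach a concurrent delete shortens the result instead of aborting the listing. The trade-off is that the caller cannot tell a complete listing from one that silently skipped a deleted subtree, so the behaviour would need to be stated in the javadoc.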
> Add FileSystem#listLocatedStatus to list a directory's content together with
> each file's block locations
> --------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-6870
> URL: https://issues.apache.org/jira/browse/HADOOP-6870
> Project: Hadoop Common
> Issue Type: New Feature
> Components: fs
> Affects Versions: 0.22.0
> Reporter: Hairong Kuang
> Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: listFiles.patch, listFiles1.patch, listFiles2.patch,
> listFiles3.patch, listFiles4.patch
>
>
> This jira implements the new FileSystem API as proposed in HDFS-202. The new
> API aims to eliminate individual "getFileBlockLocations" calls to NN for each
> file in the input directory of a job. Instead, a file's block locations are
> returned together with FileStatus when listing a directory, thus improving
> getSplits performance.